Overview

Dataset statistics

Number of variables: 137
Number of observations: 36259
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.9 MiB
Average record size in memory: 1.1 KiB
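
The average record size can be cross-checked from the total size and the number of observations. The sketch below is only a consistency check; it assumes that MiB and KiB are binary units (1 MiB = 1024 KiB) and that the report rounds to one decimal place. The function name is hypothetical.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_MIB = 37.9
N_OBSERVATIONS = 36259


def average_record_size_kib(total_mib: float, n_rows: int) -> float:
    """Average memory per row in KiB, assuming 1 MiB = 1024 KiB."""
    return total_mib * 1024 / n_rows


avg_kib = average_record_size_kib(TOTAL_SIZE_MIB, N_OBSERVATIONS)
print(f"{avg_kib:.1f} KiB")  # prints "1.1 KiB"
```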

Variable types

Numeric: 58
Categorical: 79

Alerts

SEQN is highly correlated with household_smokers High correlation
asthma is highly correlated with asthma_onset and 1 other field High correlation
asthma_onset is highly correlated with asthma and 1 other field High correlation
asthma_currently is highly correlated with asthma and 1 other field High correlation
ever_overweight is highly correlated with BMI High correlation
arthritis is highly correlated with arthritis_onset and 1 other field High correlation
heart_failure is highly correlated with heart_failure_onset High correlation
heart_disease is highly correlated with heart_disease_onset High correlation
angina is highly correlated with angina_onset High correlation
heart_attack is highly correlated with heart_attack_onset High correlation
stroke is highly correlated with stroke_onset and 1 other field High correlation
emphysema is highly correlated with emphysema_onset High correlation
bronchitis is highly correlated with bronchitis_currently and 1 other field High correlation
liver_condition is highly correlated with liver_condition_currently and 1 other field High correlation
thyroid_problem is highly correlated with thyroid_problem_currently and 1 other field High correlation
bronchitis_currently is highly correlated with bronchitis and 1 other field High correlation
liver_condition_currently is highly correlated with liver_condition and 1 other field High correlation
thyroid_problem_currently is highly correlated with thyroid_problem and 1 other field High correlation
cancer is highly correlated with first_cancer_count High correlation
arthritis_onset is highly correlated with arthritis and 1 other field High correlation
heart_failure_onset is highly correlated with heart_failure High correlation
heart_disease_onset is highly correlated with heart_disease High correlation
angina_onset is highly correlated with angina High correlation
heart_attack_onset is highly correlated with heart_attack High correlation
stroke_onset is highly correlated with stroke and 1 other field High correlation
emphysema_onset is highly correlated with emphysema High correlation
bronchitis_onset is highly correlated with bronchitis and 1 other field High correlation
liver_condition_onset is highly correlated with liver_condition and 1 other field High correlation
thyroid_problem_onset is highly correlated with thyroid_problem and 1 other field High correlation
first_cancer_count is highly correlated with cancer High correlation
weight is highly correlated with BMI High correlation
BMI is highly correlated with ever_overweight and 1 other field High correlation
systolic is highly correlated with diastolic High correlation
diastolic is highly correlated with systolic High correlation
albumin is highly correlated with calcium and 11 other fields High correlation
ALT is highly correlated with AST High correlation
AST is highly correlated with ALT High correlation
ALP is highly correlated with calcium and 6 other fields High correlation
BUN is highly correlated with creatinine and 2 other fields High correlation
calcium is highly correlated with albumin and 12 other fields High correlation
CO2 is highly correlated with albumin and 10 other fields High correlation
creatinine is highly correlated with BUN High correlation
glucose is highly correlated with calcium and 3 other fields High correlation
iron is highly correlated with albumin High correlation
LHD is highly correlated with albumin and 10 other fields High correlation
phosphorus is highly correlated with albumin and 10 other fields High correlation
total_protein is highly correlated with albumin and 11 other fields High correlation
uric_acid is highly correlated with albumin and 11 other fields High correlation
sodium is highly correlated with albumin and 12 other fields High correlation
potassium is highly correlated with albumin and 13 other fields High correlation
chloride is highly correlated with albumin and 11 other fields High correlation
osmolality is highly correlated with albumin and 12 other fields High correlation
globulin is highly correlated with albumin and 11 other fields High correlation
cant_work is highly correlated with limited_work High correlation
limited_work is highly correlated with cant_work High correlation
walking_equipment is highly correlated with healthcare_equipment High correlation
healthcare_equipment is highly correlated with walking_equipment High correlation
health_problem_Arthritis is highly correlated with arthritis and 1 other field High correlation
health_problem_Stroke is highly correlated with stroke and 1 other field High correlation
cocaine_use is highly correlated with cocaine_number_uses and 1 other field High correlation
cocaine_number_uses is highly correlated with cocaine_use and 2 other fields High correlation
heroine_use is highly correlated with inject_drugs High correlation
meth_use is highly correlated with cocaine_use and 2 other fields High correlation
meth_number_uses is highly correlated with cocaine_number_uses and 1 other field High correlation
inject_drugs is highly correlated with heroine_use High correlation
start_smoking_age is highly correlated with current_smoker High correlation
current_smoker is highly correlated with start_smoking_age and 1 other field High correlation
previous_cigarettes_per_day is highly correlated with days_quit_smoking High correlation
current_cigarettes_per_day is highly correlated with current_smoker High correlation
days_quit_smoking is highly correlated with previous_cigarettes_per_day High correlation
household_smokers is highly correlated with SEQN High correlation
gender is highly correlated with height and 1 other field High correlation
height is highly correlated with gender High correlation
albumin is highly correlated with calcium and 1 other field High correlation
ALT is highly correlated with AST and 1 other field High correlation
BUN is highly correlated with creatinine and 1 other field High correlation
calcium is highly correlated with albumin High correlation
creatinine is highly correlated with gender and 2 other fields High correlation
GGT is highly correlated with ALT High correlation
total_protein is highly correlated with albumin and 1 other field High correlation
uric_acid is highly correlated with creatinine High correlation
sodium is highly correlated with osmolality High correlation
osmolality is highly correlated with BUN and 1 other field High correlation
globulin is highly correlated with total_protein High correlation
drinks_per_occasion is highly correlated with drinks_past_year High correlation
drinks_past_year is highly correlated with drinks_per_occasion High correlation
cocaine_use is highly correlated with cocaine_number_uses and 2 other fields High correlation
meth_number_uses is highly correlated with cocaine_use and 2 other fields High correlation
start_smoking_age is highly correlated with current_smoker and 2 other fields High correlation
previous_cigarettes_per_day is highly correlated with start_smoking_age and 1 other field High correlation
days_quit_smoking is highly correlated with start_smoking_age and 1 other field High correlation
gender is highly correlated with height High correlation
BMI is highly correlated with weight High correlation
total_protein is highly correlated with globulin High correlation
osmolality is highly correlated with sodium High correlation
SEQN is highly correlated with lifetime_alcohol_consumption and 1 other field High correlation
gender is highly correlated with uric_acid High correlation
age is highly correlated with marital_status and 5 other fields High correlation
race is highly correlated with birth_place High correlation
citizenship is highly correlated with birth_place High correlation
education_level is highly correlated with marital_status High correlation
marital_status is highly correlated with age and 2 other fields High correlation
household_size is highly correlated with marital_status High correlation
pregnant is highly correlated with age and 1 other field High correlation
birth_place is highly correlated with race and 1 other field High correlation
asthma_currently is highly correlated with asthma and 3 other fields High correlation
asthma_emergency is highly correlated with asthma_currently High correlation
ever_overweight is highly correlated with weight and 1 other field High correlation
arthritis is highly correlated with age and 4 other fields High correlation
heart_failure is highly correlated with heart_attack and 2 other fields High correlation
heart_disease is highly correlated with angina and 4 other fields High correlation
angina is highly correlated with heart_disease and 1 other field High correlation
heart_attack is highly correlated with heart_failure and 4 other fields High correlation
cancer is highly correlated with first_cancer_type and 1 other field High correlation
first_cancer_type is highly correlated with cancer and 2 other fields High correlation
second_cancer_type is highly correlated with first_cancer_type and 3 other fields High correlation
third_cancer_type is highly correlated with second_cancer_type and 3 other fields High correlation
fourth_cancer_count is highly correlated with third_cancer_type High correlation
hay_fever is highly correlated with asthma_currently High correlation
arthritis_onset is highly correlated with age and 3 other fields High correlation
heart_failure_onset is highly correlated with heart_failure and 4 other fields High correlation
heart_disease_onset is highly correlated with heart_disease and 4 other fields High correlation
angina_onset is highly correlated with angina and 3 other fields High correlation
heart_attack_onset is highly correlated with heart_disease and 4 other fields High correlation
arthritis_type is highly correlated with arthritis and 1 other field High correlation
first_cancer_count is highly correlated with cancer and 1 other field High correlation
second_cancer_count is highly correlated with second_cancer_type High correlation
third_cancer_count is highly correlated with second_cancer_type and 1 other field High correlation
weight is highly correlated with ever_overweight and 2 other fields High correlation
height is highly correlated with weight and 1 other field High correlation
BMI is highly correlated with ever_overweight and 2 other fields High correlation
pulse is highly correlated with systolic High correlation
systolic is highly correlated with pulse and 1 other field High correlation
albumin is highly correlated with pregnant and 10 other fields High correlation
AST is highly correlated with ALT and 2 other fields High correlation
ALP is highly correlated with third_cancer_type High correlation
BUN is highly correlated with creatinine and 3 other fields High correlation
calcium is highly correlated with albumin and 9 other fields High correlation
CO2 is highly correlated with albumin and 9 other fields High correlation
creatinine is highly correlated with BUN and 1 other field High correlation
GGT is highly correlated with AST High correlation
glucose is highly correlated with phosphorus and 5 other fields High correlation
iron is highly correlated with uric_acid and 4 other fields High correlation
LHD is highly correlated with AST High correlation
phosphorus is highly correlated with albumin and 12 other fields High correlation
total_protein is highly correlated with albumin and 9 other fields High correlation
uric_acid is highly correlated with gender and 13 other fields High correlation
sodium is highly correlated with albumin and 11 other fields High correlation
potassium is highly correlated with albumin and 9 other fields High correlation
vigorous_recreation is highly correlated with moderate_recreation High correlation
moderate_recreation is highly correlated with vigorous_recreation High correlation
vigorous_work is highly correlated with moderate_work High correlation
moderate_work is highly correlated with vigorous_work High correlation
lifetime_alcohol_consumption is highly correlated with SEQN High correlation
cant_work is highly correlated with limited_work and 4 other fields High correlation
limited_work is highly correlated with arthritis and 6 other fields High correlation
walking_equipment is highly correlated with cant_work and 3 other fields High correlation
memory_problems is highly correlated with cant_work and 1 other field High correlation
healthcare_equipment is highly correlated with cant_work and 2 other fields High correlation
health_problem_Back or Neck is highly correlated with cant_work and 2 other fields High correlation
health_problem_Arthritis is highly correlated with age and 5 other fields High correlation
health_problem_Blood Pressure is highly correlated with health_problem_Diabetes High correlation
health_problem_Heart is highly correlated with heart_failure and 3 other fields High correlation
health_problem_Diabetes is highly correlated with health_problem_Blood Pressure High correlation
marijuana_use is highly correlated with age and 1 other field High correlation
cocaine_use is highly correlated with marijuana_use and 4 other fields High correlation
cocaine_number_uses is highly correlated with cocaine_use and 1 other field High correlation
heroine_use is highly correlated with cocaine_use and 1 other field High correlation
inject_drugs is highly correlated with heroine_use and 1 other field High correlation
rehab_program is highly correlated with cocaine_use High correlation
current_smoker is highly correlated with start_smoking_age and 3 other fields High correlation
previous_cigarettes_per_day is highly correlated with current_smoker and 1 other field High correlation
current_cigarettes_per_day is highly correlated with current_smoker and 1 other field High correlation
days_quit_smoking is highly correlated with current_smoker and 1 other field High correlation
household_smokers is highly correlated with SEQN and 1 other field High correlation
third_cancer_type is highly correlated with fourth_cancer_count and 1 other field High correlation
arthritis is highly correlated with health_problem_Arthritis and 1 other field High correlation
bronchitis is highly correlated with bronchitis_currently High correlation
health_problem_Stroke is highly correlated with stroke High correlation
asthma is highly correlated with asthma_currently High correlation
second_cancer_type is highly correlated with second_cancer_count High correlation
birth_place is highly correlated with citizenship High correlation
bronchitis_currently is highly correlated with bronchitis High correlation
third_cancer_count is highly correlated with third_cancer_type High correlation
thyroid_problem_currently is highly correlated with thyroid_problem High correlation
liver_condition is highly correlated with liver_condition_currently High correlation
liver_condition_currently is highly correlated with liver_condition High correlation
health_problem_Arthritis is highly correlated with arthritis High correlation
arthritis_type is highly correlated with arthritis High correlation
cancer is highly correlated with first_cancer_count and 1 other field High correlation
asthma_currently is highly correlated with asthma High correlation
first_cancer_type is highly correlated with first_cancer_count and 1 other field High correlation
cocaine_use is highly correlated with meth_use High correlation
thyroid_problem is highly correlated with thyroid_problem_currently High correlation
gender is highly correlated with pregnant High correlation
stroke is highly correlated with health_problem_Stroke High correlation
pregnant is highly correlated with gender High correlation
meth_use is highly correlated with cocaine_use High correlation
cocaine_per_month is highly skewed (γ1 = 49.82114522) Skewed
heronine_per_month is highly skewed (γ1 = 38.06270681) Skewed
meth_per_month is highly skewed (γ1 = 31.76947474) Skewed
SEQN has unique values Unique
education_level has 2157 (5.9%) zeros Zeros
household_income has 1813 (5.0%) zeros Zeros
asthma_onset has 30981 (85.4%) zeros Zeros
arthritis_onset has 27002 (74.5%) zeros Zeros
heart_failure_onset has 35164 (97.0%) zeros Zeros
heart_disease_onset has 34890 (96.2%) zeros Zeros
angina_onset has 35387 (97.6%) zeros Zeros
heart_attack_onset has 34837 (96.1%) zeros Zeros
stroke_onset has 35000 (96.5%) zeros Zeros
emphysema_onset has 35566 (98.1%) zeros Zeros
bronchitis_onset has 34295 (94.6%) zeros Zeros
liver_condition_onset has 34879 (96.2%) zeros Zeros
thyroid_problem_onset has 32795 (90.4%) zeros Zeros
cancer_onset has 35750 (98.6%) zeros Zeros
BMI has 386 (1.1%) zeros Zeros
pulse has 728 (2.0%) zeros Zeros
systolic has 2530 (7.0%) zeros Zeros
diastolic has 2724 (7.5%) zeros Zeros
albumin has 1915 (5.3%) zeros Zeros
ALT has 1999 (5.5%) zeros Zeros
AST has 2016 (5.6%) zeros Zeros
ALP has 1923 (5.3%) zeros Zeros
BUN has 1921 (5.3%) zeros Zeros
calcium has 1950 (5.4%) zeros Zeros
CO2 has 1992 (5.5%) zeros Zeros
creatinine has 1917 (5.3%) zeros Zeros
GGT has 1923 (5.3%) zeros Zeros
glucose has 1917 (5.3%) zeros Zeros
iron has 1955 (5.4%) zeros Zeros
LHD has 2106 (5.8%) zeros Zeros
phosphorus has 1923 (5.3%) zeros Zeros
bilirubin has 1939 (5.3%) zeros Zeros
total_protein has 1966 (5.4%) zeros Zeros
uric_acid has 1927 (5.3%) zeros Zeros
sodium has 1918 (5.3%) zeros Zeros
potassium has 1924 (5.3%) zeros Zeros
chloride has 1918 (5.3%) zeros Zeros
osmolality has 1923 (5.3%) zeros Zeros
globulin has 1967 (5.4%) zeros Zeros
drinks_per_occasion has 15602 (43.0%) zeros Zeros
drinks_past_year has 15602 (43.0%) zeros Zeros
marijuana_per_month has 32756 (90.3%) zeros Zeros
cocaine_number_uses has 32310 (89.1%) zeros Zeros
cocaine_per_month has 35837 (98.8%) zeros Zeros
heronine_per_month has 36200 (99.8%) zeros Zeros
meth_number_uses has 34660 (95.6%) zeros Zeros
meth_per_month has 36111 (99.6%) zeros Zeros
start_smoking_age has 21426 (59.1%) zeros Zeros
previous_cigarettes_per_day has 28368 (78.2%) zeros Zeros
current_cigarettes_per_day has 28822 (79.5%) zeros Zeros
days_quit_smoking has 27934 (77.0%) zeros Zeros
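
The γ1 value in the Skewed alerts is a skewness coefficient. The sketch below computes the simple population form, γ1 = m3 / m2^1.5, on hypothetical data to show why a column that is almost entirely zeros with a few large values gets a large positive γ1. It is an illustration under stated assumptions, not the profiler's exact code: pandas' `Series.skew` applies a sample-size correction, so reported values can differ slightly, and the function name and example data here are made up.

```python
def population_skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical columns, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
zero_heavy = [0] * 995 + [30] * 5  # 99.5% zeros, a few large values

print(population_skewness(symmetric))   # symmetric data gives 0.0
print(population_skewness(zero_heavy))  # strongly positive
```

The zero-heavy pattern mirrors the `*_per_month` columns above, which combine extreme γ1 values with more than 98% zeros.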

Reproduction

Analysis started: 2022-05-08 15:21:47.340180
Analysis finished: 2022-05-08 15:23:07.548546
Duration: 1 minute and 20.21 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
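
The reported duration can be recomputed from the two timestamps. This is a minimal check using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-05-08 15:21:47.340180", FMT)
finished = datetime.strptime("2022-05-08 15:23:07.548546", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute and {seconds:.2f} seconds")
# prints "1 minute and 20.21 seconds"
```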

Variables

SEQN
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 36259
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67521.5638
Minimum: 31131
Maximum: 102956
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 31131
5-th percentile: 35125.6
Q1: 49463
Median: 66902
Q3: 85901.5
95-th percentile: 99540.1
Maximum: 102956
Range: 71825
Interquartile range (IQR): 36438.5
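
The range and IQR follow directly from the quantiles listed above. The sketch below recomputes them and, as an illustrative extra, applies the common 1.5 × IQR fence rule; that rule is a convention assumed here, not something the report states it uses.

```python
# Quantiles copied from the SEQN quantile statistics above.
minimum, q1, q3, maximum = 31131, 49463, 85901.5, 102956

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

# Tukey fences (assumed convention, for illustration only).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, has_outliers)  # prints "71825 36438.5 False"
```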

Descriptive statistics

Standard deviation: 20832.29403
Coefficient of variation (CV): 0.3085280147
Kurtosis: -1.234354549
Mean: 67521.5638
Median Absolute Deviation (MAD): 18208
Skewness: 0.001381509476
Sum: 2448264382
Variance: 433984474.5
Monotonicity: Strictly increasing
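
Several of these descriptive statistics are derived from one another. As a quick cross-check, the sketch below recomputes the mean, coefficient of variation, and variance from the sum, the number of observations, and the standard deviation. It assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared).

```python
# Figures copied from the report.
n = 36259            # number of observations
total = 2448264382   # Sum
std = 20832.29403    # Standard deviation

mean = total / n
cv = std / mean
variance = std ** 2

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.1f}")
```

Small differences in the last digits are expected because the inputs are already rounded.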
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
31131                 1      < 0.1%
80181                 1      < 0.1%
80168                 1      < 0.1%
80169                 1      < 0.1%
80170                 1      < 0.1%
80175                 1      < 0.1%
80178                 1      < 0.1%
80179                 1      < 0.1%
80183                 1      < 0.1%
80164                 1      < 0.1%
Other values (36249)  36249  > 99.9%

Value   Count  Frequency (%)
31131   1      < 0.1%
31132   1      < 0.1%
31134   1      < 0.1%
31139   1      < 0.1%
31143   1      < 0.1%
31144   1      < 0.1%
31149   1      < 0.1%
31150   1      < 0.1%
31151   1      < 0.1%
31152   1      < 0.1%

Value   Count  Frequency (%)
102956  1      < 0.1%
102954  1      < 0.1%
102953  1      < 0.1%
102952  1      < 0.1%
102949  1      < 0.1%
102947  1      < 0.1%
102944  1      < 0.1%
102943  1      < 0.1%
102935  1      < 0.1%
102934  1      < 0.1%

depression
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Not Depressed: 33091
Depressed: 3168

Length

Max length: 13
Median length: 13
Mean length: 12.65051436
Min length: 9

Characters and Unicode

Total characters: 458695
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
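
The length statistics and the total character count can be reproduced from the two category counts alone. A minimal sketch, assuming the labels are exactly "Not Depressed" and "Depressed" with the counts shown in this section:

```python
from statistics import median

# Category counts copied from this section.
counts = {"Not Depressed": 33091, "Depressed": 3168}

n = sum(counts.values())
total_chars = sum(len(label) * c for label, c in counts.items())
mean_length = total_chars / n

# Expand to one length per row to take the median.
lengths = [len(label) for label, c in counts.items() for _ in range(c)]

print(n, total_chars, round(mean_length, 8))
print(min(lengths), median(lengths), max(lengths))
```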

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not Depressed
2nd row: Not Depressed
3rd row: Not Depressed
4th row: Not Depressed
5th row: Not Depressed

Common Values

Value          Count  Frequency (%)
Not Depressed  33091  91.3%
Depressed      3168   8.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count  Frequency (%)
depressed  36259  52.3%
not        33091  47.7%

Most occurring characters

Value    Count   Frequency (%)
e        108777  23.7%
s        72518   15.8%
D        36259   7.9%
p        36259   7.9%
r        36259   7.9%
d        36259   7.9%
N        33091   7.2%
o        33091   7.2%
t        33091   7.2%
(space)  33091   7.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter356254
77.7%
Uppercase Letter69350
 
15.1%
Space Separator33091
 
7.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e108777
30.5%
s72518
20.4%
p36259
 
10.2%
r36259
 
10.2%
d36259
 
10.2%
o33091
 
9.3%
t33091
 
9.3%
Uppercase Letter
ValueCountFrequency (%)
D36259
52.3%
N33091
47.7%
Space Separator
ValueCountFrequency (%)
33091
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin425604
92.8%
Common33091
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e108777
25.6%
s72518
17.0%
D36259
 
8.5%
p36259
 
8.5%
r36259
 
8.5%
d36259
 
8.5%
N33091
 
7.8%
o33091
 
7.8%
t33091
 
7.8%
Common
ValueCountFrequency (%)
33091
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII458695
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e108777
23.7%
s72518
15.8%
D36259
 
7.9%
p36259
 
7.9%
r36259
 
7.9%
d36259
 
7.9%
N33091
 
7.2%
o33091
 
7.2%
t33091
 
7.2%
33091
 
7.2%

gender
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
1
18447 
0
17812 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters36259
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row1
5th row0

Common Values

Value   Count   Frequency (%)
1       18447   50.9%
0       17812   49.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
1       18447   50.9%
0       17812   49.1%

Most occurring characters

ValueCountFrequency (%)
118447
50.9%
017812
49.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number36259
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
118447
50.9%
017812
49.1%

Most occurring scripts

ValueCountFrequency (%)
Common36259
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
118447
50.9%
017812
49.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII36259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
118447
50.9%
017812
49.1%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct68
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47.78625996
Minimum18
Maximum85
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum18
5-th percentile19
Q131
median47
Q363
95-th percentile80
Maximum85
Range67
Interquartile range (IQR)32

Descriptive statistics

Standard deviation18.76397783
Coefficient of variation (CV)0.3926647083
Kurtosis-1.141364256
Mean47.78625996
Median Absolute Deviation (MAD)16
Skewness0.08984870427
Sum1732682
Variance352.086864
MonotonicityNot monotonic
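
Several of the figures above are simple functions of the others, so they can be cross-checked by hand. The following is a minimal sketch of that arithmetic using only the values reported for age; the variable names are illustrative and nothing here is taken from the profiling tool's own code.

```python
# Values copied from the age summary above.
n = 36259              # number of observations
total = 1732682        # Sum
std = 18.76397783      # Standard deviation
q1, q3 = 31, 63        # Q1 and Q3
minimum, maximum = 18, 85

mean = total / n                   # Mean
variance = std ** 2                # Variance
cv = std / mean                    # Coefficient of variation (CV)
iqr = q3 - q1                      # Interquartile range (IQR)
value_range = maximum - minimum    # Range

print(mean, variance, cv, iqr, value_range)
```

The results agree with the reported mean, variance, CV, IQR and range up to rounding.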
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
80                  1829    5.0%
18                  1076    3.0%
19                  1055    2.9%
60                  774     2.1%
61                  725     2.0%
62                  685     1.9%
63                  652     1.8%
22                  631     1.7%
23                  617     1.7%
40                  613     1.7%
Other values (58)   27602   76.1%

Value   Count   Frequency (%)
18      1076    3.0%
19      1055    2.9%
20      584     1.6%
21      570     1.6%
22      631     1.7%
23      617     1.7%
24      588     1.6%
25      579     1.6%
26      555     1.5%
27      533     1.5%

Value   Count   Frequency (%)
85      101     0.3%
84      24      0.1%
83      34      0.1%
82      35      0.1%
81      45      0.1%
80      1829    5.0%
79      245     0.7%
78      292     0.8%
77      276     0.8%
76      311     0.9%

race
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
White
15179 
Black
7930 
Mexican
5874 
Other and Multiracial
3848 
Other Hispanic
3428 

Length

Max length21
Median length5
Mean length7.872886732
Min length5

Characters and Unicode

Total characters285463
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBlack
2nd rowWhite
3rd rowWhite
4th rowOther Hispanic
5th rowWhite

Common Values

Value                   Count   Frequency (%)
White                   15179   41.9%
Black                   7930    21.9%
Mexican                 5874    16.2%
Other and Multiracial   3848    10.6%
Other Hispanic          3428    9.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value         Count   Frequency (%)
white         15179   32.0%
black         7930    16.7%
other         7276    15.4%
mexican       5874    12.4%
and           3848    8.1%
multiracial   3848    8.1%
hispanic      3428    7.2%

Most occurring characters

ValueCountFrequency (%)
i35605
12.5%
a28776
10.1%
e28329
9.9%
t26303
 
9.2%
h22455
 
7.9%
c21080
 
7.4%
l15626
 
5.5%
W15179
 
5.3%
n13150
 
4.6%
11124
 
3.9%
Other values (11)67836
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter230804
80.9%
Uppercase Letter43535
 
15.3%
Space Separator11124
 
3.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i35605
15.4%
a28776
12.5%
e28329
12.3%
t26303
11.4%
h22455
9.7%
c21080
9.1%
l15626
6.8%
n13150
 
5.7%
r11124
 
4.8%
k7930
 
3.4%
Other values (5)20426
8.8%
Uppercase Letter
ValueCountFrequency (%)
W15179
34.9%
M9722
22.3%
B7930
18.2%
O7276
16.7%
H3428
 
7.9%
Space Separator
ValueCountFrequency (%)
11124
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin274339
96.1%
Common11124
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i35605
13.0%
a28776
10.5%
e28329
10.3%
t26303
9.6%
h22455
 
8.2%
c21080
 
7.7%
l15626
 
5.7%
W15179
 
5.5%
n13150
 
4.8%
r11124
 
4.1%
Other values (10)56712
20.7%
Common
ValueCountFrequency (%)
11124
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII285463
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i35605
12.5%
a28776
10.1%
e28329
9.9%
t26303
 
9.2%
h22455
 
7.9%
c21080
 
7.4%
l15626
 
5.5%
W15179
 
5.3%
n13150
 
4.6%
11124
 
3.9%
Other values (11)67836
23.8%

citizenship
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
Citizen
31238 
Not Citizen
4952 
Missing
 
69

Length

Max length11
Median length7
Mean length7.546291955
Min length7

Characters and Unicode

Total characters273621
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCitizen
2nd rowCitizen
3rd rowCitizen
4th rowCitizen
5th rowCitizen

Common Values

Value         Count   Frequency (%)
Citizen       31238   86.2%
Not Citizen   4952    13.7%
Missing       69      0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value     Count   Frequency (%)
citizen   36190   87.8%
not       4952    12.0%
missing   69      0.2%

Most occurring characters

ValueCountFrequency (%)
i72518
26.5%
t41142
15.0%
n36259
13.3%
C36190
13.2%
z36190
13.2%
e36190
13.2%
N4952
 
1.8%
o4952
 
1.8%
4952
 
1.8%
s138
 
0.1%
Other values (2)138
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter227458
83.1%
Uppercase Letter41211
 
15.1%
Space Separator4952
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i72518
31.9%
t41142
18.1%
n36259
15.9%
z36190
15.9%
e36190
15.9%
o4952
 
2.2%
s138
 
0.1%
g69
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
C36190
87.8%
N4952
 
12.0%
M69
 
0.2%
Space Separator
ValueCountFrequency (%)
4952
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin268669
98.2%
Common4952
 
1.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
i72518
27.0%
t41142
15.3%
n36259
13.5%
C36190
13.5%
z36190
13.5%
e36190
13.5%
N4952
 
1.8%
o4952
 
1.8%
s138
 
0.1%
M69
 
< 0.1%
Common
ValueCountFrequency (%)
4952
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII273621
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i72518
26.5%
t41142
15.0%
n36259
13.3%
C36190
13.2%
z36190
13.2%
e36190
13.2%
N4952
 
1.8%
o4952
 
1.8%
4952
 
1.8%
s138
 
0.1%
Other values (2)138
 
0.1%

education_level
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.209106705
Minimum0
Maximum5
Zeros2157
Zeros (%)5.9%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.464931537
Coefficient of variation (CV)0.4564919997
Kurtosis-0.5719330285
Mean3.209106705
Median Absolute Deviation (MAD)1
Skewness-0.5887800903
Sum116359
Variance2.146024407
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value   Count   Frequency (%)
4       10136   28.0%
3       7887    21.8%
5       7813    21.5%
2       4823    13.3%
1       3443    9.5%
0       2157    5.9%

Value   Count   Frequency (%)
0       2157    5.9%
1       3443    9.5%
2       4823    13.3%
3       7887    21.8%
4       10136   28.0%
5       7813    21.5%

Value   Count   Frequency (%)
5       7813    21.5%
4       10136   28.0%
3       7887    21.8%
2       4823    13.3%
1       3443    9.5%
0       2157    5.9%

marital_status
Categorical

HIGH CORRELATION

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
Married
17574 
Never Married
6602 
Divorced
3751 
Partner
2842 
Widowed
2695 
Other values (2)
2795 

Length

Max length13
Median length7
Mean length8.259852726
Min length7

Characters and Unicode

Total characters299494
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMarried
2nd rowMarried
3rd rowMarried
4th rowNever Married
5th rowNever Married

Common Values

Value           Count   Frequency (%)
Married         17574   48.5%
Never Married   6602    18.2%
Divorced        3751    10.3%
Partner         2842    7.8%
Widowed         2695    7.4%
Missing         1636    4.5%
Separated       1159    3.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count   Frequency (%)
married     24176   56.4%
never       6602    15.4%
divorced    3751    8.8%
partner     2842    6.6%
widowed     2695    6.3%
missing     1636    3.8%
separated   1159    2.7%

Most occurring characters

ValueCountFrequency (%)
r65548
21.9%
e48986
16.4%
d34476
11.5%
i33894
11.3%
a29336
9.8%
M25812
 
8.6%
v10353
 
3.5%
N6602
 
2.2%
6602
 
2.2%
o6446
 
2.2%
Other values (11)31439
10.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter250031
83.5%
Uppercase Letter42861
 
14.3%
Space Separator6602
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r65548
26.2%
e48986
19.6%
d34476
13.8%
i33894
13.6%
a29336
11.7%
v10353
 
4.1%
o6446
 
2.6%
n4478
 
1.8%
t4001
 
1.6%
c3751
 
1.5%
Other values (4)8762
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
M25812
60.2%
N6602
 
15.4%
D3751
 
8.8%
P2842
 
6.6%
W2695
 
6.3%
S1159
 
2.7%
Space Separator
ValueCountFrequency (%)
6602
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin292892
97.8%
Common6602
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r65548
22.4%
e48986
16.7%
d34476
11.8%
i33894
11.6%
a29336
10.0%
M25812
 
8.8%
v10353
 
3.5%
N6602
 
2.3%
o6446
 
2.2%
n4478
 
1.5%
Other values (10)26961
9.2%
Common
ValueCountFrequency (%)
6602
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII299494
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r65548
21.9%
e48986
16.4%
d34476
11.5%
i33894
11.3%
a29336
9.8%
M25812
 
8.6%
v10353
 
3.5%
N6602
 
2.2%
6602
 
2.2%
o6446
 
2.2%
Other values (11)31439
10.5%

household_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.206348769
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median3
Q34
95-th percentile7
Maximum7
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.677449524
Coefficient of variation (CV)0.523165022
Kurtosis-0.439337375
Mean3.206348769
Median Absolute Deviation (MAD)1
Skewness0.6473513447
Sum116259
Variance2.813836905
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value   Count   Frequency (%)
2       10769   29.7%
3       6607    18.2%
4       5905    16.3%
1       4945    13.6%
5       3929    10.8%
7       2066    5.7%
6       2038    5.6%

Value   Count   Frequency (%)
1       4945    13.6%
2       10769   29.7%
3       6607    18.2%
4       5905    16.3%
5       3929    10.8%
6       2038    5.6%
7       2066    5.7%

Value   Count   Frequency (%)
7       2066    5.7%
6       2038    5.6%
5       3929    10.8%
4       5905    16.3%
3       6607    18.2%
2       10769   29.7%
1       4945    13.6%

pregnant
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
Missing
28366 
No
7237 
Yes
 
656

Length

Max length7
Median length7
Mean length5.929672633
Min length2

Characters and Unicode

Total characters215004
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo
2nd rowMissing
3rd rowMissing
4th rowNo
5th rowMissing

Common Values

Value     Count   Frequency (%)
Missing   28366   78.2%
No        7237    20.0%
Yes       656     1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value     Count   Frequency (%)
missing   28366   78.2%
no        7237    20.0%
yes       656     1.8%

Most occurring characters

ValueCountFrequency (%)
s57388
26.7%
i56732
26.4%
M28366
13.2%
n28366
13.2%
g28366
13.2%
N7237
 
3.4%
o7237
 
3.4%
Y656
 
0.3%
e656
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter178745
83.1%
Uppercase Letter36259
 
16.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s57388
32.1%
i56732
31.7%
n28366
15.9%
g28366
15.9%
o7237
 
4.0%
e656
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
M28366
78.2%
N7237
 
20.0%
Y656
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Latin215004
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s57388
26.7%
i56732
26.4%
M28366
13.2%
n28366
13.2%
g28366
13.2%
N7237
 
3.4%
o7237
 
3.4%
Y656
 
0.3%
e656
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII215004
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s57388
26.7%
i56732
26.4%
M28366
13.2%
n28366
13.2%
g28366
13.2%
N7237
 
3.4%
o7237
 
3.4%
Y656
 
0.3%
e656
 
0.3%

birth_place
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
USA
26624 
Mexico
7704 
Other Spanish Country
 
865
Other Non Spanish Country
 
690
Elsewhere
 
362

Length

Max length25
Median length3
Mean length4.54692628
Min length3

Characters and Unicode

Total characters164867
Distinct characters26
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUSA
2nd rowUSA
3rd rowUSA
4th rowUSA
5th rowUSA

Common Values

Value                       Count   Frequency (%)
USA                         26624   73.4%
Mexico                      7704    21.2%
Other Spanish Country       865     2.4%
Other Non Spanish Country   690     1.9%
Elsewhere                   362     1.0%
Missing                     14      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count   Frequency (%)
usa         26624   66.5%
mexico      7704    19.2%
other       1555    3.9%
spanish     1555    3.9%
country     1555    3.9%
non         690     1.7%
elsewhere   362     0.9%
missing     14      < 0.1%

Most occurring characters

ValueCountFrequency (%)
S28179
17.1%
U26624
16.1%
A26624
16.1%
e10345
 
6.3%
o9949
 
6.0%
i9287
 
5.6%
M7718
 
4.7%
x7704
 
4.7%
c7704
 
4.7%
n3814
 
2.3%
Other values (16)26919
16.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter93307
56.6%
Lowercase Letter67760
41.1%
Space Separator3800
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e10345
15.3%
o9949
14.7%
i9287
13.7%
x7704
11.4%
c7704
11.4%
n3814
 
5.6%
h3472
 
5.1%
r3472
 
5.1%
t3110
 
4.6%
s1945
 
2.9%
Other values (7)6958
10.3%
Uppercase Letter
ValueCountFrequency (%)
S28179
30.2%
U26624
28.5%
A26624
28.5%
M7718
 
8.3%
O1555
 
1.7%
C1555
 
1.7%
N690
 
0.7%
E362
 
0.4%
Space Separator
ValueCountFrequency (%)
3800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin161067
97.7%
Common3800
 
2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S28179
17.5%
U26624
16.5%
A26624
16.5%
e10345
 
6.4%
o9949
 
6.2%
i9287
 
5.8%
M7718
 
4.8%
x7704
 
4.8%
c7704
 
4.8%
n3814
 
2.4%
Other values (15)23119
14.4%
Common
ValueCountFrequency (%)
3800
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII164867
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S28179
17.1%
U26624
16.1%
A26624
16.1%
e10345
 
6.3%
o9949
 
6.0%
i9287
 
5.6%
M7718
 
4.7%
x7704
 
4.7%
c7704
 
4.7%
n3814
 
2.3%
Other values (16)26919
16.3%

veteran
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
No
32325 
Yes
3929 
Missing
 
5

Length

Max length7
Median length2
Mean length2.109048788
Min length2

Characters and Unicode

Total characters76472
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo
2nd rowYes
3rd rowYes
4th rowNo
5th rowNo

Common Values

Value     Count   Frequency (%)
No        32325   89.2%
Yes       3929    10.8%
Missing   5       < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value     Count   Frequency (%)
no        32325   89.2%
yes       3929    10.8%
missing   5       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N32325
42.3%
o32325
42.3%
s3939
 
5.2%
Y3929
 
5.1%
e3929
 
5.1%
i10
 
< 0.1%
M5
 
< 0.1%
n5
 
< 0.1%
g5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter40213
52.6%
Uppercase Letter36259
47.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o32325
80.4%
s3939
 
9.8%
e3929
 
9.8%
i10
 
< 0.1%
n5
 
< 0.1%
g5
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N32325
89.2%
Y3929
 
10.8%
M5
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin76472
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N32325
42.3%
o32325
42.3%
s3939
 
5.2%
Y3929
 
5.1%
e3929
 
5.1%
i10
 
< 0.1%
M5
 
< 0.1%
n5
 
< 0.1%
g5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII76472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N32325
42.3%
o32325
42.3%
s3939
 
5.2%
Y3929
 
5.1%
e3929
 
5.1%
i10
 
< 0.1%
M5
 
< 0.1%
n5
 
< 0.1%
g5
 
< 0.1%

household_income
Real number (ℝ≥0)

ZEROS

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.919357953
Minimum0
Maximum12
Zeros1813
Zeros (%)5.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0.9
Q14
median7
Q310
95-th percentile12
Maximum12
Range12
Interquartile range (IQR)6

Descriptive statistics

Standard deviation3.523152592
Coefficient of variation (CV)0.509173339
Kurtosis-0.9595620438
Mean6.919357953
Median Absolute Deviation (MAD)3
Skewness-0.1357985848
Sum250889
Variance12.41260419
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
Value              Count   Frequency (%)
12                 4906    13.5%
6                  4154    11.5%
5                  3947    10.9%
11                 3839    10.6%
7                  3419    9.4%
4                  2803    7.7%
8                  2796    7.7%
3                  2407    6.6%
9                  2119    5.8%
0                  1813    5.0%
Other values (3)   4056    11.2%

Value   Count   Frequency (%)
0       1813    5.0%
1       828     2.3%
2       1473    4.1%
3       2407    6.6%
4       2803    7.7%
5       3947    10.9%
6       4154    11.5%
7       3419    9.4%
8       2796    7.7%
9       2119    5.8%

Value   Count   Frequency (%)
12      4906    13.5%
11      3839    10.6%
10      1755    4.8%
9       2119    5.8%
8       2796    7.7%
7       3419    9.4%
6       4154    11.5%
5       3947    10.9%
4       2803    7.7%
3       2407    6.6%

asthma
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
30902 
1.0
5357 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     30902   85.2%
1.0     5357    14.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     30902   85.2%
1.0     5357    14.8%

Most occurring characters

ValueCountFrequency (%)
067161
61.7%
.36259
33.3%
15357
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
067161
92.6%
15357
 
7.4%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
067161
61.7%
.36259
33.3%
15357
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
067161
61.7%
.36259
33.3%
15357
 
4.9%

asthma_onset
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct82
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.142364654
Minimum0
Maximum85
Zeros30981
Zeros (%)85.4%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile25
Maximum85
Range85
Interquartile range (IQR)0

Descriptive statistics

Standard deviation10.7631073
Coefficient of variation (CV)3.425161777
Kurtosis18.28610056
Mean3.142364654
Median Absolute Deviation (MAD)0
Skewness4.194778713
Sum113939
Variance115.8444788
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   30981   85.4%
1                   482     1.3%
5                   306     0.8%
10                  272     0.8%
6                   228     0.6%
7                   220     0.6%
8                   212     0.6%
12                  193     0.5%
4                   159     0.4%
3                   153     0.4%
Other values (72)   3053    8.4%

Value   Count   Frequency (%)
0       30981   85.4%
1       482     1.3%
2       130     0.4%
3       153     0.4%
4       159     0.4%
5       306     0.8%
6       228     0.6%
7       220     0.6%
8       212     0.6%
9       122     0.3%

Value   Count   Frequency (%)
85      2       < 0.1%
80      20      0.1%
79      2       < 0.1%
78      7       < 0.1%
77      4       < 0.1%
76      6       < 0.1%
75      11      < 0.1%
74      6       < 0.1%
73      14      < 0.1%
72      12      < 0.1%

asthma_currently
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
33142 
1.0
 
3117

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     33142   91.4%
1.0     3117    8.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     33142   91.4%
1.0     3117    8.6%

Most occurring characters

ValueCountFrequency (%)
069401
63.8%
.36259
33.3%
13117
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069401
95.7%
13117
 
4.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069401
63.8%
.36259
33.3%
13117
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069401
63.8%
.36259
33.3%
13117
 
2.9%

asthma_emergency
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35760 
1.0
 
499

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     35760   98.6%
1.0     499     1.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     35760   98.6%
1.0     499     1.4%

Most occurring characters

ValueCountFrequency (%)
072019
66.2%
.36259
33.3%
1499
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072019
99.3%
1499
 
0.7%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072019
66.2%
.36259
33.3%
1499
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072019
66.2%
.36259
33.3%
1499
 
0.5%

anemia
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
34729 
1.0
 
1530

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     34729   95.8%
1.0     1530    4.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     34729   95.8%
1.0     1530    4.2%

Most occurring characters

ValueCountFrequency (%)
070988
65.3%
.36259
33.3%
11530
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070988
97.9%
11530
 
2.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070988
65.3%
.36259
33.3%
11530
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070988
65.3%
.36259
33.3%
11530
 
1.4%

ever_overweight
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
23742 
1.0
12517 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     23742   65.5%
1.0     12517   34.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     23742   65.5%
1.0     12517   34.5%

Most occurring characters

ValueCountFrequency (%)
060001
55.2%
.36259
33.3%
112517
 
11.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
060001
82.7%
112517
 
17.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
060001
55.2%
.36259
33.3%
112517
 
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
060001
55.2%
.36259
33.3%
112517
 
11.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
32018 
1.0
4241 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value   Count   Frequency (%)
0.0     32018   88.3%
1.0     4241    11.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count   Frequency (%)
0.0     32018   88.3%
1.0     4241    11.7%

Most occurring characters

ValueCountFrequency (%)
068277
62.8%
.36259
33.3%
14241
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
068277
94.2%
14241
 
5.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
068277
62.8%
.36259
33.3%
14241
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
068277
62.8%
.36259
33.3%
14241
 
3.9%

arthritis
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    26863
1.0    9396

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    26863  74.1%
1.0    9396   25.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    26863  74.1%
1.0    9396   25.9%

Most occurring characters

ValueCountFrequency (%)
063122
58.0%
.36259
33.3%
19396
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
063122
87.0%
19396
 
13.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
063122
58.0%
.36259
33.3%
19396
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
063122
58.0%
.36259
33.3%
19396
 
8.6%

heart_failure
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35150
1.0    1109

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35150  96.9%
1.0    1109   3.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35150  96.9%
1.0    1109   3.1%

Most occurring characters

ValueCountFrequency (%)
071409
65.6%
.36259
33.3%
11109
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071409
98.5%
11109
 
1.5%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071409
65.6%
.36259
33.3%
11109
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071409
65.6%
.36259
33.3%
11109
 
1.0%

heart_disease
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    34872
1.0    1387

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34872  96.2%
1.0    1387   3.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34872  96.2%
1.0    1387   3.8%

Most occurring characters

ValueCountFrequency (%)
071131
65.4%
.36259
33.3%
11387
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071131
98.1%
11387
 
1.9%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071131
65.4%
.36259
33.3%
11387
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071131
65.4%
.36259
33.3%
11387
 
1.3%

angina
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35374
1.0    885

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35374  97.6%
1.0    885    2.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35374  97.6%
1.0    885    2.4%

Most occurring characters

ValueCountFrequency (%)
071633
65.9%
.36259
33.3%
1885
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071633
98.8%
1885
 
1.2%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071633
65.9%
.36259
33.3%
1885
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071633
65.9%
.36259
33.3%
1885
 
0.8%

heart_attack
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    34815
1.0    1444

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34815  96.0%
1.0    1444   4.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34815  96.0%
1.0    1444   4.0%

Most occurring characters

ValueCountFrequency (%)
071074
65.3%
.36259
33.3%
11444
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071074
98.0%
11444
 
2.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071074
65.3%
.36259
33.3%
11444
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071074
65.3%
.36259
33.3%
11444
 
1.3%

stroke
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    34975
1.0    1284

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34975  96.5%
1.0    1284   3.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34975  96.5%
1.0    1284   3.5%

Most occurring characters

ValueCountFrequency (%)
071234
65.5%
.36259
33.3%
11284
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071234
98.2%
11284
 
1.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071234
65.5%
.36259
33.3%
11284
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071234
65.5%
.36259
33.3%
11284
 
1.2%

emphysema
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35553
1.0    706

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35553  98.1%
1.0    706    1.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35553  98.1%
1.0    706    1.9%

Most occurring characters

ValueCountFrequency (%)
071812
66.0%
.36259
33.3%
1706
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071812
99.0%
1706
 
1.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071812
66.0%
.36259
33.3%
1706
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071812
66.0%
.36259
33.3%
1706
 
0.6%

bronchitis
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    34226
1.0    2033

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34226  94.4%
1.0    2033   5.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34226  94.4%
1.0    2033   5.6%

Most occurring characters

ValueCountFrequency (%)
070485
64.8%
.36259
33.3%
12033
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070485
97.2%
12033
 
2.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070485
64.8%
.36259
33.3%
12033
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070485
64.8%
.36259
33.3%
12033
 
1.9%

liver_condition
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    34864
1.0    1395

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34864  96.2%
1.0    1395   3.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34864  96.2%
1.0    1395   3.8%

Most occurring characters

ValueCountFrequency (%)
071123
65.4%
.36259
33.3%
11395
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071123
98.1%
11395
 
1.9%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071123
65.4%
.36259
33.3%
11395
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071123
65.4%
.36259
33.3%
11395
 
1.3%

thyroid_problem
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    32746
1.0    3513

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    32746  90.3%
1.0    3513   9.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    32746  90.3%
1.0    3513   9.7%

Most occurring characters

ValueCountFrequency (%)
069005
63.4%
.36259
33.3%
13513
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069005
95.2%
13513
 
4.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069005
63.4%
.36259
33.3%
13513
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069005
63.4%
.36259
33.3%
13513
 
3.2%

bronchitis_currently
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35305
1.0    954

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35305  97.4%
1.0    954    2.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35305  97.4%
1.0    954    2.6%

Most occurring characters

ValueCountFrequency (%)
071564
65.8%
.36259
33.3%
1954
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071564
98.7%
1954
 
1.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071564
65.8%
.36259
33.3%
1954
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071564
65.8%
.36259
33.3%
1954
 
0.9%

liver_condition_currently
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35501
1.0    758

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35501  97.9%
1.0    758    2.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35501  97.9%
1.0    758    2.1%

Most occurring characters

ValueCountFrequency (%)
071760
66.0%
.36259
33.3%
1758
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071760
99.0%
1758
 
1.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071760
66.0%
.36259
33.3%
1758
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071760
66.0%
.36259
33.3%
1758
 
0.7%

thyroid_problem_currently
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    33675
1.0    2584

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    33675  92.9%
1.0    2584   7.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    33675  92.9%
1.0    2584   7.1%

Most occurring characters

ValueCountFrequency (%)
069934
64.3%
.36259
33.3%
12584
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069934
96.4%
12584
 
3.6%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069934
64.3%
.36259
33.3%
12584
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069934
64.3%
.36259
33.3%
12584
 
2.4%

cancer
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    32973
1.0    3286

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    32973  90.9%
1.0    3286   9.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    32973  90.9%
1.0    3286   9.1%

Most occurring characters

ValueCountFrequency (%)
069232
63.6%
.36259
33.3%
13286
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069232
95.5%
13286
 
4.5%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069232
63.6%
.36259
33.3%
13286
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069232
63.6%
.36259
33.3%
13286
 
3.0%

first_cancer_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value              Count
None               33001
Breast             521
Skin Non Melanoma  498
Prostate           494
Skin Other         241
Other values (26)  1504

Length

Max length17
Median length4
Mean length4.413331862
Min length4

Characters and Unicode

Total characters160023
Distinct characters37
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone

Common Values

Value              Count  Frequency (%)
None               33001  91.0%
Breast             521    1.4%
Skin Non Melanoma  498    1.4%
Prostate           494    1.4%
Skin Other         241    0.7%
Cervical           220    0.6%
Colon              204    0.6%
Melanoma           195    0.5%
Other              143    0.4%
Uterine            128    0.4%
Other values (21)  614    1.7%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
none               33001  88.0%
skin               739    2.0%
melanoma           693    1.8%
breast             521    1.4%
non                498    1.3%
prostate           494    1.3%
other              384    1.0%
cervical           220    0.6%
colon              204    0.5%
uterine            128    0.3%
Other values (23)  619    1.7%

Most occurring characters

ValueCountFrequency (%)
e35862
22.4%
n35506
22.2%
o35329
22.1%
N33500
20.9%
a3073
 
1.9%
t2105
 
1.3%
r2024
 
1.3%
i1368
 
0.9%
l1256
 
0.8%
1242
 
0.8%
Other values (27)8758
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter121280
75.8%
Uppercase Letter37501
 
23.4%
Space Separator1242
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e35862
29.6%
n35506
29.3%
o35329
29.1%
a3073
 
2.5%
t2105
 
1.7%
r2024
 
1.7%
i1368
 
1.1%
l1256
 
1.0%
s1061
 
0.9%
m899
 
0.7%
Other values (12)2797
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
N33500
89.3%
S768
 
2.0%
M712
 
1.9%
B632
 
1.7%
P502
 
1.3%
O451
 
1.2%
C424
 
1.1%
L209
 
0.6%
U128
 
0.3%
T85
 
0.2%
Other values (4)90
 
0.2%
Space Separator
ValueCountFrequency (%)
1242
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin158781
99.2%
Common1242
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e35862
22.6%
n35506
22.4%
o35329
22.3%
N33500
21.1%
a3073
 
1.9%
t2105
 
1.3%
r2024
 
1.3%
i1368
 
0.9%
l1256
 
0.8%
s1061
 
0.7%
Other values (26)7697
 
4.8%
Common
ValueCountFrequency (%)
1242
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII160023
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e35862
22.4%
n35506
22.2%
o35329
22.1%
N33500
20.9%
a3073
 
1.9%
t2105
 
1.3%
r2024
 
1.3%
i1368
 
0.9%
l1256
 
0.8%
1242
 
0.8%
Other values (27)8758
 
5.5%

second_cancer_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 27
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value              Count
None               35918
Skin Non Melanoma  40
Prostate           37
Skin Other         34
Other              29
Other values (22)  201

Length

Max length17
Median length4
Mean length4.038859318
Min length4

Characters and Unicode

Total characters146445
Distinct characters34
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone

Common Values

Value              Count  Frequency (%)
None               35918  99.1%
Skin Non Melanoma  40     0.1%
Prostate           37     0.1%
Skin Other         34     0.1%
Other              29     0.1%
Melanoma           28     0.1%
Colon              27     0.1%
Ovarian            22     0.1%
Breast             18     < 0.1%
Uterine            17     < 0.1%
Other values (17)  89     0.2%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
none               35918  98.7%
skin               74     0.2%
melanoma           68     0.2%
other              63     0.2%
non                40     0.1%
prostate           37     0.1%
colon              27     0.1%
ovarian            22     0.1%
breast             18     < 0.1%
uterine            17     < 0.1%
Other values (17)  89     0.2%

Most occurring characters

ValueCountFrequency (%)
e36199
24.7%
n36195
24.7%
o36150
24.7%
N35958
24.6%
a289
 
0.2%
r196
 
0.1%
t186
 
0.1%
i149
 
0.1%
l125
 
0.1%
114
 
0.1%
Other values (24)884
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter109958
75.1%
Uppercase Letter36373
 
24.8%
Space Separator114
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e36199
32.9%
n36195
32.9%
o36150
32.9%
a289
 
0.3%
r196
 
0.2%
t186
 
0.2%
i149
 
0.1%
l125
 
0.1%
m90
 
0.1%
h85
 
0.1%
Other values (10)294
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
N35958
98.9%
O85
 
0.2%
S77
 
0.2%
M71
 
0.2%
B43
 
0.1%
P39
 
0.1%
C33
 
0.1%
L26
 
0.1%
U17
 
< 0.1%
K9
 
< 0.1%
Other values (3)15
 
< 0.1%
Space Separator
ValueCountFrequency (%)
114
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin146331
99.9%
Common114
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e36199
24.7%
n36195
24.7%
o36150
24.7%
N35958
24.6%
a289
 
0.2%
r196
 
0.1%
t186
 
0.1%
i149
 
0.1%
l125
 
0.1%
m90
 
0.1%
Other values (23)794
 
0.5%
Common
ValueCountFrequency (%)
114
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII146445
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e36199
24.7%
n36195
24.7%
o36150
24.7%
N35958
24.6%
a289
 
0.2%
r196
 
0.1%
t186
 
0.1%
i149
 
0.1%
l125
 
0.1%
114
 
0.1%
Other values (24)884
 
0.6%

third_cancer_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value              Count
None               36220
Other              5
Skin Non Melanoma  5
Skin Other         4
Uterine            3
Other values (13)  22

Length

Max length17
Median length4
Mean length4.004247221
Min length4

Characters and Unicode

Total characters145190
Distinct characters29
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)< 0.1%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone

Common Values

Value              Count  Frequency (%)
None               36220  99.9%
Other              5      < 0.1%
Skin Non Melanoma  5      < 0.1%
Skin Other         4      < 0.1%
Uterine            3      < 0.1%
Melanoma           3      < 0.1%
Thyroid            3      < 0.1%
Ovarian            2      < 0.1%
Liver              2      < 0.1%
Colon              2      < 0.1%
Other values (8)   10     < 0.1%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
none               36220  99.9%
other              9      < 0.1%
skin               9      < 0.1%
melanoma           8      < 0.1%
non                5      < 0.1%
uterine            3      < 0.1%
thyroid            3      < 0.1%
prostate           2      < 0.1%
lung               2      < 0.1%
colon              2      < 0.1%
Other values (8)   10     < 0.1%

Most occurring characters

ValueCountFrequency (%)
n36253
25.0%
e36252
25.0%
o36245
25.0%
N36225
25.0%
a26
 
< 0.1%
r24
 
< 0.1%
i22
 
< 0.1%
t17
 
< 0.1%
14
 
< 0.1%
h12
 
< 0.1%
Other values (19)100
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter108903
75.0%
Uppercase Letter36273
 
25.0%
Space Separator14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n36253
33.3%
e36252
33.3%
o36245
33.3%
a26
 
< 0.1%
r24
 
< 0.1%
i22
 
< 0.1%
t17
 
< 0.1%
h12
 
< 0.1%
l12
 
< 0.1%
k10
 
< 0.1%
Other values (8)30
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N36225
99.9%
O11
 
< 0.1%
S9
 
< 0.1%
M8
 
< 0.1%
L5
 
< 0.1%
B4
 
< 0.1%
T3
 
< 0.1%
C3
 
< 0.1%
U3
 
< 0.1%
P2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin145176
> 99.9%
Common14
 
< 0.1%

Most frequent character per script

Latin
Value              Count  Frequency (%)
n                  36253  25.0%
e                  36252  25.0%
o                  36245  25.0%
N                  36225  25.0%
a                  26     < 0.1%
r                  24     < 0.1%
i                  22     < 0.1%
t                  17     < 0.1%
h                  12     < 0.1%
l                  12     < 0.1%
Other values (18)  88     0.1%

Common
Value    Count  Frequency (%)
(space)  14     100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  145190  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
n                  36253  25.0%
e                  36252  25.0%
o                  36245  25.0%
N                  36225  25.0%
a                  26     < 0.1%
r                  24     < 0.1%
i                  22     < 0.1%
t                  17     < 0.1%
(space)            14     < 0.1%
h                  12     < 0.1%
Other values (19)  100    0.1%

fourth_cancer_count
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    36255
1.0    4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
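
The character statistics reported for fourth_cancer_count can be reproduced from its value counts alone. The following is a minimal sketch, assuming the values are stored as the strings "0.0" and "1.0"; it uses Python's standard `unicodedata` module, and the `CATEGORY_NAMES` mapping and all variable names are hand-written for this illustration rather than taken from the profiling tool.

```python
import unicodedata
from collections import Counter

# Value counts as reported for fourth_cancer_count (values assumed to be strings).
value_counts = {"0.0": 36255, "1.0": 4}

# Illustrative display names for the two general-category codes that occur here.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Po": "Other Punctuation"}

# Count every character, weighted by how many rows hold each value.
char_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows

# Group the character counts by Unicode general category.
category_counts = Counter()
for ch, n in char_counts.items():
    code = unicodedata.category(ch)  # "Nd" for digits, "Po" for "."
    category_counts[CATEGORY_NAMES.get(code, code)] += n

total_characters = sum(char_counts.values())
print(total_characters)
print(dict(char_counts))
print(dict(category_counts))
```

The totals agree with the figures listed for this variable: 108777 characters in all, of which 72518 are decimal digits and 36259 are punctuation.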

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    36255  > 99.9%
1.0    4      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    36255  > 99.9%
1.0    4      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0      72514  66.7%
.      36259  33.3%
1      4      < 0.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      72514  > 99.9%
1      4      < 0.1%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      72514  66.7%
.      36259  33.3%
1      4      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      72514  66.7%
.      36259  33.3%
1      4      < 0.1%
< 0.1%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    32053
1.0    4206

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    32053  88.4%
1.0    4206   11.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    32053  88.4%
1.0    4206   11.6%

Most occurring characters

Value  Count  Frequency (%)
0      68312  62.8%
.      36259  33.3%
1      4206   3.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      68312  94.2%
1      4206   5.8%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      68312  62.8%
.      36259  33.3%
1      4206   3.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      68312  62.8%
.      36259  33.3%
1      4206   3.9%

asthma_relative
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    28799
1.0    7460

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    28799  79.4%
1.0    7460   20.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    28799  79.4%
1.0    7460   20.6%

Most occurring characters

Value  Count  Frequency (%)
0      65058  59.8%
.      36259  33.3%
1      7460   6.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      65058  89.7%
1      7460   10.3%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      65058  59.8%
.      36259  33.3%
1      7460   6.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      65058  59.8%
.      36259  33.3%
1      7460   6.9%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    21984
1.0    14275

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    21984  60.6%
1.0    14275  39.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    21984  60.6%
1.0    14275  39.4%

Most occurring characters

Value  Count  Frequency (%)
0      58243  53.5%
.      36259  33.3%
1      14275  13.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      58243  80.3%
1      14275  19.7%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      58243  53.5%
.      36259  33.3%
1      14275  13.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      58243  53.5%
.      36259  33.3%
1      14275  13.1%

hay_fever
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35815
1.0    444

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    35815  98.8%
1.0    444    1.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35815  98.8%
1.0    444    1.2%

Most occurring characters

Value  Count  Frequency (%)
0      72074  66.3%
.      36259  33.3%
1      444    0.4%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      72074  99.4%
1      444    0.6%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      72074  66.3%
.      36259  33.3%
1      444    0.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      72074  66.3%
.      36259  33.3%
1      444    0.4%

arthritis_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 86
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.59731377
Minimum: 0
Maximum: 85
Zeros: 27002
Zeros (%): 74.5%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 14
95-th percentile: 64
Maximum: 85
Range: 85
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 23.06201685
Coefficient of variation (CV): 1.830709092
Kurtosis: 0.666210629
Mean: 12.59731377
Median Absolute Deviation (MAD): 0
Skewness: 1.500282105
Sum: 456766
Variance: 531.8566214
Monotonicity: Not monotonic
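
Several of the arthritis_onset figures are tied together by simple arithmetic, which gives a quick consistency check. The sketch below recomputes the mean, coefficient of variation, and variance from the reported sum, row count, and standard deviation. It also shows how the mean shifts once the zero rows are excluded, under the assumption (not confirmed by the report) that 0 stands for "no onset recorded" rather than onset at age 0. Variable names are illustrative only.

```python
# Figures copied from the arthritis_onset summary.
n_rows = 36259          # number of observations in the dataset
total = 456766          # "Sum"
std = 23.06201685       # "Standard deviation"
zeros = 27002           # "Zeros"

mean = total / n_rows   # arithmetic mean over all rows
cv = std / mean         # coefficient of variation = std / mean
variance = std ** 2     # variance is the squared standard deviation

# Assumption: a value of 0 is a placeholder for "no onset", not a real age.
nonzero_rows = n_rows - zeros
mean_nonzero = total / nonzero_rows   # zeros add nothing to the sum
zero_share = zeros / n_rows

print(f"mean={mean:.4f} cv={cv:.4f} variance={variance:.4f}")
print(f"non-zero rows={nonzero_rows} mean over non-zero rows={mean_nonzero:.2f}")
print(f"share of zeros={zero_share:.1%}")
```

If the placeholder assumption holds, the mean over the 9257 non-zero rows is roughly 49, which is far more informative than the overall mean of about 12.6.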
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  27002  74.5%
50                 678    1.9%
60                 500    1.4%
40                 489    1.3%
55                 438    1.2%
45                 401    1.1%
65                 304    0.8%
70                 274    0.8%
35                 246    0.7%
30                 241    0.7%
Other values (76)  5686   15.7%

Value  Count  Frequency (%)
0      27002  74.5%
1      6      < 0.1%
2      9      < 0.1%
3      8      < 0.1%
4      8      < 0.1%
5      15     < 0.1%
6      9      < 0.1%
7      10     < 0.1%
8      14     < 0.1%
9      9      < 0.1%

Value  Count  Frequency (%)
85     12     < 0.1%
84     1      < 0.1%
83     2      < 0.1%
82     5      < 0.1%
81     2      < 0.1%
80     196    0.5%
79     34     0.1%
78     57     0.2%
77     45     0.1%
76     50     0.1%

heart_failure_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 77
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.720152238
Minimum: 0
Maximum: 85
Zeros: 35164
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.10442933
Coefficient of variation (CV): 5.874148292
Kurtosis: 35.25451661
Mean: 1.720152238
Median Absolute Deviation (MAD): 0
Skewness: 5.985168142
Sum: 62371
Variance: 102.0994921
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  35164  97.0%
80                 60     0.2%
60                 51     0.1%
50                 49     0.1%
65                 47     0.1%
70                 47     0.1%
55                 39     0.1%
58                 31     0.1%
53                 30     0.1%
45                 29     0.1%
Other values (67)  712    2.0%

Value  Count  Frequency (%)
0      35164  97.0%
1      5      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
6      3      < 0.1%
10     2      < 0.1%
12     2      < 0.1%
13     1      < 0.1%
15     3      < 0.1%

Value  Count  Frequency (%)
85     6      < 0.1%
84     1      < 0.1%
83     2      < 0.1%
81     2      < 0.1%
80     60     0.2%
79     11     < 0.1%
78     12     < 0.1%
77     5      < 0.1%
76     13     < 0.1%
75     18     < 0.1%

heart_disease_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 76
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.162166634
Minimum: 0
Maximum: 85
Zeros: 34890
Zeros (%): 96.2%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 11.23360833
Coefficient of variation (CV): 5.195533106
Kurtosis: 26.21103411
Mean: 2.162166634
Median Absolute Deviation (MAD): 0
Skewness: 5.216191319
Sum: 78398
Variance: 126.1939561
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  34890  96.2%
60                 74     0.2%
50                 67     0.2%
55                 66     0.2%
65                 54     0.1%
80                 49     0.1%
58                 46     0.1%
70                 43     0.1%
56                 43     0.1%
66                 40     0.1%
Other values (66)  887    2.4%

Value  Count  Frequency (%)
0      34890  96.2%
1      1      < 0.1%
3      1      < 0.1%
5      3      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      2      < 0.1%
10     2      < 0.1%
11     5      < 0.1%
12     2      < 0.1%

Value  Count  Frequency (%)
85     4      < 0.1%
82     3      < 0.1%
81     1      < 0.1%
80     49     0.1%
79     12     < 0.1%
78     15     < 0.1%
77     9      < 0.1%
76     14     < 0.1%
75     20     0.1%
74     20     0.1%

angina_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 73
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.260321575
Minimum: 0
Maximum: 85
Zeros: 35387
Zeros (%): 97.6%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.359120273
Coefficient of variation (CV): 6.632529694
Kurtosis: 47.57952611
Mean: 1.260321575
Median Absolute Deviation (MAD): 0
Skewness: 6.875207816
Sum: 45698
Variance: 69.87489174
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  35387  97.6%
60                 45     0.1%
50                 39     0.1%
45                 36     0.1%
40                 34     0.1%
55                 32     0.1%
58                 27     0.1%
59                 24     0.1%
35                 23     0.1%
46                 23     0.1%
Other values (63)  589    1.6%

Value  Count  Frequency (%)
0      35387  97.6%
7      3      < 0.1%
8      2      < 0.1%
10     3      < 0.1%
12     2      < 0.1%
15     4      < 0.1%
16     4      < 0.1%
17     1      < 0.1%
18     3      < 0.1%
20     7      < 0.1%

Value  Count  Frequency (%)
85     2      < 0.1%
84     2      < 0.1%
82     1      < 0.1%
81     1      < 0.1%
80     18     < 0.1%
79     7      < 0.1%
78     3      < 0.1%
77     9      < 0.1%
76     7      < 0.1%
75     9      < 0.1%

heart_attack_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 73
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.176232108
Minimum: 0
Maximum: 85
Zeros: 34837
Zeros (%): 96.1%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 11.12234436
Coefficient of variation (CV): 5.110826333
Kurtosis: 26.08220172
Mean: 2.176232108
Median Absolute Deviation (MAD): 0
Skewness: 5.180584037
Sum: 78908
Variance: 123.7065441
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  34837  96.1%
50                 62     0.2%
60                 60     0.2%
55                 58     0.2%
65                 47     0.1%
80                 47     0.1%
58                 43     0.1%
45                 41     0.1%
56                 40     0.1%
70                 39     0.1%
Other values (63)  985    2.7%

Value  Count  Frequency (%)
0      34837  96.1%
2      1      < 0.1%
10     2      < 0.1%
14     1      < 0.1%
15     2      < 0.1%
16     4      < 0.1%
19     1      < 0.1%
20     7      < 0.1%
21     3      < 0.1%
22     1      < 0.1%

Value  Count  Frequency (%)
85     8      < 0.1%
84     2      < 0.1%
83     2      < 0.1%
82     2      < 0.1%
81     1      < 0.1%
80     47     0.1%
79     10     < 0.1%
78     15     < 0.1%
77     11     < 0.1%
76     10     < 0.1%

stroke_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 79
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.966573816
Minimum: 0
Maximum: 85
Zeros: 35000
Zeros (%): 96.5%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.78401004
Coefficient of variation (CV): 5.48365383
Kurtosis: 30.62400557
Mean: 1.966573816
Median Absolute Deviation (MAD): 0
Skewness: 5.592836639
Sum: 71306
Variance: 116.2948725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  35000  96.5%
80                 75     0.2%
65                 53     0.1%
55                 45     0.1%
50                 45     0.1%
70                 43     0.1%
60                 39     0.1%
53                 36     0.1%
62                 35     0.1%
58                 34     0.1%
Other values (69)  854    2.4%

Value  Count  Frequency (%)
0      35000  96.5%
1      2      < 0.1%
2      2      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      3      < 0.1%
10     1      < 0.1%
13     1      < 0.1%

Value  Count  Frequency (%)
85     4      < 0.1%
84     1      < 0.1%
81     2      < 0.1%
80     75     0.2%
79     18     < 0.1%
78     18     < 0.1%
77     13     < 0.1%
76     19     0.1%
75     21     0.1%
74     16     < 0.1%

emphysema_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 74
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.032157533
Minimum: 0
Maximum: 85
Zeros: 35566
Zeros (%): 98.1%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7.682353009
Coefficient of variation (CV): 7.443004348
Kurtosis: 59.92145013
Mean: 1.032157533
Median Absolute Deviation (MAD): 0
Skewness: 7.708904052
Sum: 37425
Variance: 59.01854775
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  35566  98.1%
60                 41     0.1%
65                 33     0.1%
50                 31     0.1%
45                 31     0.1%
55                 28     0.1%
40                 24     0.1%
70                 23     0.1%
66                 18     < 0.1%
58                 17     < 0.1%
Other values (64)  447    1.2%

Value  Count  Frequency (%)
0      35566  98.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
10     1      < 0.1%
11     1      < 0.1%
13     2      < 0.1%
14     2      < 0.1%
16     2      < 0.1%
18     1      < 0.1%

Value  Count  Frequency (%)
85     1      < 0.1%
83     1      < 0.1%
82     2      < 0.1%
81     1      < 0.1%
80     10     < 0.1%
79     4      < 0.1%
78     3      < 0.1%
77     7      < 0.1%
76     7      < 0.1%
75     14     < 0.1%

bronchitis_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 82
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.897101409
Minimum: 0
Maximum: 85
Zeros: 34295
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 7
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 9.252438597
Coefficient of variation (CV): 4.87714497
Kurtosis: 31.37586206
Mean: 1.897101409
Median Absolute Deviation (MAD): 0
Skewness: 5.497210681
Sum: 68787
Variance: 85.60761998
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  34295  94.6%
16                 114    0.3%
50                 86     0.2%
40                 83     0.2%
30                 74     0.2%
35                 61     0.2%
25                 58     0.2%
20                 51     0.1%
55                 49     0.1%
60                 44     0.1%
Other values (72)  1344   3.7%

Value  Count  Frequency (%)
0      34295  94.6%
1      31     0.1%
2      17     < 0.1%
3      12     < 0.1%
4      18     < 0.1%
5      30     0.1%
6      23     0.1%
7      25     0.1%
8      27     0.1%
9      15     < 0.1%

Value  Count  Frequency (%)
85     2      < 0.1%
80     19     0.1%
79     2      < 0.1%
78     8      < 0.1%
77     3      < 0.1%
76     9      < 0.1%
75     13     < 0.1%
74     9      < 0.1%
73     12     < 0.1%
72     7      < 0.1%

liver_condition_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 81
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.598085992
Minimum: 0
Maximum: 82
Zeros: 34879
Zeros (%): 96.2%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 82
Range: 82
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.687700538
Coefficient of variation (CV): 5.436316055
Kurtosis: 34.4114687
Mean: 1.598085992
Median Absolute Deviation (MAD): 0
Skewness: 5.814174448
Sum: 57945
Variance: 75.47614064
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  34879  96.2%
35                 60     0.2%
45                 58     0.2%
40                 53     0.1%
50                 49     0.1%
30                 39     0.1%
55                 39     0.1%
60                 38     0.1%
25                 33     0.1%
42                 31     0.1%
Other values (71)  980    2.7%

Value  Count  Frequency (%)
0      34879  96.2%
1      4      < 0.1%
3      1      < 0.1%
4      2      < 0.1%
5      3      < 0.1%
6      7      < 0.1%
7      8      < 0.1%
8      5      < 0.1%
9      4      < 0.1%
10     11     < 0.1%

Value  Count  Frequency (%)
82     1      < 0.1%
80     10     < 0.1%
79     3      < 0.1%
78     3      < 0.1%
77     3      < 0.1%
76     2      < 0.1%
75     4      < 0.1%
74     6      < 0.1%
73     7      < 0.1%
72     3      < 0.1%

thyroid_problem_onset
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 84
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.216497973
Minimum: 0
Maximum: 85
Zeros: 32795
Zeros (%): 90.4%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 42
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 14.12490136
Coefficient of variation (CV): 3.349912996
Kurtosis: 11.36041647
Mean: 4.216497973
Median Absolute Deviation (MAD): 0
Skewness: 3.476290574
Sum: 152886
Variance: 199.5128383
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  32795  90.4%
40                 166    0.5%
50                 161    0.4%
30                 144    0.4%
45                 112    0.3%
35                 112    0.3%
55                 107    0.3%
60                 102    0.3%
25                 88     0.2%
80                 87     0.2%
Other values (74)  2385   6.6%

Value  Count  Frequency (%)
0      32795  90.4%
1      2      < 0.1%
2      3      < 0.1%
4      1      < 0.1%
5      9      < 0.1%
6      1      < 0.1%
7      7      < 0.1%
8      9      < 0.1%
9      5      < 0.1%
10     11     < 0.1%

Value  Count  Frequency (%)
85     3      < 0.1%
84     1      < 0.1%
83     1      < 0.1%
82     2      < 0.1%
80     87     0.2%
79     10     < 0.1%
78     12     < 0.1%
77     19     0.1%
76     17     < 0.1%
75     44     0.1%

cancer_onset
Real number (ℝ≥0)

ZEROS

Distinct: 65
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7822609559
Minimum: 0
Maximum: 85
Zeros: 35750
Zeros (%): 98.6%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.823851538
Coefficient of variation (CV): 8.723241888
Kurtosis: 84.41872073
Mean: 0.7822609559
Median Absolute Deviation (MAD): 0
Skewness: 9.106085267
Sum: 28364
Variance: 46.56494981
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  35750  98.6%
80                 33     0.1%
50                 20     0.1%
60                 16     < 0.1%
64                 15     < 0.1%
59                 15     < 0.1%
65                 15     < 0.1%
62                 14     < 0.1%
55                 14     < 0.1%
63                 14     < 0.1%
Other values (55)  353    1.0%

Value  Count  Frequency (%)
0      35750  98.6%
16     3      < 0.1%
17     1      < 0.1%
19     2      < 0.1%
20     4      < 0.1%
21     2      < 0.1%
22     4      < 0.1%
23     3      < 0.1%
24     2      < 0.1%
25     3      < 0.1%

Value  Count  Frequency (%)
85     1      < 0.1%
80     33     0.1%
79     5      < 0.1%
78     5      < 0.1%
77     3      < 0.1%
76     8      < 0.1%
75     12     < 0.1%
74     7      < 0.1%
73     5      < 0.1%
72     8      < 0.1%

arthritis_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value           Count
Missing         29748
Osteoarthritis  3576
Rheumatoid      1797
Other           1057
Psoriatic       81

Length

Max length: 14
Median length: 7
Mean length: 7.785211947
Min length: 5

Characters and Unicode

Total characters: 282284
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Missing
2nd row: Missing
3rd row: Missing
4th row: Missing
5th row: Missing

Common Values

Value           Count  Frequency (%)
Missing         29748  82.0%
Osteoarthritis  3576   9.9%
Rheumatoid      1797   5.0%
Other           1057   2.9%
Psoriatic       81     0.2%
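
The frequency percentages and length statistics for arthritis_type follow directly from the value counts. The sketch below recomputes them; the `format_share` helper is hypothetical, and its "< 0.1%" rule (applied when a share would otherwise round to 0.0%) is an assumption inferred from how the tables in this report display small shares, not the profiling tool's documented behaviour.

```python
# Value counts as reported for arthritis_type.
counts = {
    "Missing": 29748,
    "Osteoarthritis": 3576,
    "Rheumatoid": 1797,
    "Other": 1057,
    "Psoriatic": 81,
}

n_rows = sum(counts.values())


def format_share(count, total):
    """Hypothetical formatter: one decimal place, '< 0.1%' for tiny non-zero shares."""
    pct = 100 * count / total
    if 0 < pct < 0.05:  # would round to 0.0%, so show a floor instead (assumed rule)
        return "< 0.1%"
    return f"{pct:.1f}%"


shares = {name: format_share(count, n_rows) for name, count in counts.items()}

# Total characters and mean length: each label's length weighted by its row count.
total_characters = sum(len(name) * count for name, count in counts.items())
mean_length = total_characters / n_rows

for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share})")
print(f"total characters={total_characters} mean length={mean_length:.4f}")
```

The counts sum to the 36259 observations in the dataset, and the weighted label lengths reproduce the reported total of 282284 characters.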

Length

Histogram of lengths of the category

Category Frequency Plot

Value           Count  Frequency (%)
missing         29748  82.0%
osteoarthritis  3576   9.9%
rheumatoid      1797   5.0%
other           1057   2.9%
psoriatic       81     0.2%

Most occurring characters

Value             Count  Frequency (%)
i                 68607  24.3%
s                 66729  23.6%
M                 29748  10.5%
n                 29748  10.5%
g                 29748  10.5%
t                 13663  4.8%
r                 8290   2.9%
h                 6430   2.3%
e                 6430   2.3%
a                 5454   1.9%
Other values (8)  17437  6.2%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  246025  87.2%
Uppercase Letter  36259   12.8%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
i                 68607  27.9%
s                 66729  27.1%
n                 29748  12.1%
g                 29748  12.1%
t                 13663  5.6%
r                 8290   3.4%
h                 6430   2.6%
e                 6430   2.6%
a                 5454   2.2%
o                 5454   2.2%
Other values (4)  5472   2.2%

Uppercase Letter
Value  Count  Frequency (%)
M      29748  82.0%
O      4633   12.8%
R      1797   5.0%
P      81     0.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  282284  100.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
i                 68607  24.3%
s                 66729  23.6%
M                 29748  10.5%
n                 29748  10.5%
g                 29748  10.5%
t                 13663  4.8%
r                 8290   2.9%
h                 6430   2.3%
e                 6430   2.3%
a                 5454   1.9%
Other values (8)  17437  6.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  282284  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
i                 68607  24.3%
s                 66729  23.6%
M                 29748  10.5%
n                 29748  10.5%
g                 29748  10.5%
t                 13663  4.8%
r                 8290   2.9%
h                 6430   2.3%
e                 6430   2.3%
a                 5454   1.9%
Other values (8)  17437  6.2%

first_cancer_count
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    33001
1.0    3258

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    33001  91.0%
1.0    3258   9.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    33001  91.0%
1.0    3258   9.0%

Most occurring characters

Value  Count  Frequency (%)
0      69260  63.7%
.      36259  33.3%
1      3258   3.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      69260  95.5%
1      3258   4.5%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      69260  63.7%
.      36259  33.3%
1      3258   3.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      69260  63.7%
.      36259  33.3%
1      3258   3.0%

second_cancer_count
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    35918
1.0    341

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    35918  99.1%
1.0    341    0.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35918  99.1%
1.0    341    0.9%

Most occurring characters

Value  Count  Frequency (%)
0      72177  66.4%
.      36259  33.3%
1      341    0.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      72177  99.5%
1      341    0.5%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      72177  66.4%
.      36259  33.3%
1      341    0.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      72177  66.4%
.      36259  33.3%
1      341    0.3%

third_cancer_count
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB

Value  Count
0.0    36220
1.0    39

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 108777
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    36220  99.9%
1.0    39     0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    36220  99.9%
1.0    39     0.1%

Most occurring characters

Value  Count  Frequency (%)
0      72479  66.6%
.      36259  33.3%
1      39     < 0.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     72518  66.7%
Other Punctuation  36259  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      72479  99.9%
1      39     0.1%

Other Punctuation
Value  Count  Frequency (%)
.      36259  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  108777  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      72479  66.6%
.      36259  33.3%
1      39     < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  108777  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      72479  66.6%
.      36259  33.3%
1      39     < 0.1%

weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1372
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.86443642
Minimum: 0
Maximum: 371
Zeros: 332
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 51.6
Q1: 66
median: 78.2
Q3: 92.7
95-th percentile: 121.6
Maximum: 371
Range: 371
Interquartile range (IQR): 26.7

Descriptive statistics

Standard deviation: 23.05759459
Coefficient of variation (CV): 0.285138877
Kurtosis: 3.550191507
Mean: 80.86443642
Median Absolute Deviation (MAD): 13.2
Skewness: 0.6860359384
Sum: 2932063.6
Variance: 531.6526684
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    332    0.9%
73.8                 95     0.3%
78.6                 94     0.3%
79.3                 93     0.3%
77.3                 91     0.3%
65.3                 91     0.3%
69.2                 91     0.3%
72.1                 90     0.2%
71.7                 89     0.2%
75                   89     0.2%
Other values (1362)  35104  96.8%

Value  Count  Frequency (%)
0      332    0.9%
29.1   1      < 0.1%
32.3   1      < 0.1%
32.4   2      < 0.1%
32.8   1      < 0.1%
33.2   1      < 0.1%
34.7   1      < 0.1%
35.2   1      < 0.1%
35.9   1      < 0.1%
36     1      < 0.1%

Value  Count  Frequency (%)
371    1      < 0.1%
259.5  1      < 0.1%
242.6  1      < 0.1%
239.4  1      < 0.1%
230.7  1      < 0.1%
223    1      < 0.1%
222.6  1      < 0.1%
219.6  1      < 0.1%
218.6  1      < 0.1%
218.2  1      < 0.1%

height
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct617
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean165.7644717
Minimum0
Maximum204.5
Zeros328
Zeros (%)0.9%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile150.5
Q1159.7
median166.9
Q3174.5
95-th percentile184.1
Maximum204.5
Range204.5
Interquartile range (IQR)14.8

Descriptive statistics

Standard deviation18.76785245
Coefficient of variation (CV)0.1132199937
Kurtosis52.29606899
Mean165.7644717
Median Absolute Deviation (MAD)7.4
Skewness-6.147017842
Sum6010453.98
Variance352.2322856
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0328
 
0.9%
164.9163
 
0.4%
161.7151
 
0.4%
165.5148
 
0.4%
165.1145
 
0.4%
168.4145
 
0.4%
167.7145
 
0.4%
167.4144
 
0.4%
160.6143
 
0.4%
165.9143
 
0.4%
Other values (607)34604
95.4%
ValueCountFrequency (%)
0328
0.9%
123.31
 
< 0.1%
134.51
 
< 0.1%
135.31
 
< 0.1%
135.41
 
< 0.1%
136.11
 
< 0.1%
136.31
 
< 0.1%
136.51
 
< 0.1%
137.31
 
< 0.1%
137.41
 
< 0.1%
ValueCountFrequency (%)
204.51
 
< 0.1%
204.11
 
< 0.1%
203.81
 
< 0.1%
202.73
< 0.1%
202.61
 
< 0.1%
201.71
 
< 0.1%
201.61
 
< 0.1%
2011
 
< 0.1%
200.71
 
< 0.1%
200.41
 
< 0.1%

BMI
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct3139
Distinct (%)8.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean28.77315729
Minimum0
Maximum130.21
Zeros386
Zeros (%)1.1%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile19.6
Q124.045
median27.86
Q332.5
95-th percentile42.071
Maximum130.21
Range130.21
Interquartile range (IQR)8.455

Descriptive statistics

Standard deviation7.60542893
Coefficient of variation (CV)0.2643237534
Kurtosis4.560279278
Mean28.77315729
Median Absolute Deviation (MAD)4.14
Skewness0.5665297884
Sum1043285.91
Variance57.84254921
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0386
 
1.1%
26.1163
 
0.4%
28163
 
0.4%
26.6161
 
0.4%
25.8157
 
0.4%
28.9157
 
0.4%
23.9155
 
0.4%
27.9154
 
0.4%
26.2154
 
0.4%
27.8153
 
0.4%
Other values (3129)34456
95.0%
ValueCountFrequency (%)
0386
1.1%
13.181
 
< 0.1%
13.361
 
< 0.1%
13.41
 
< 0.1%
13.61
 
< 0.1%
14.11
 
< 0.1%
14.22
 
< 0.1%
14.31
 
< 0.1%
14.51
 
< 0.1%
14.591
 
< 0.1%
ValueCountFrequency (%)
130.211
< 0.1%
86.21
< 0.1%
84.41
< 0.1%
82.91
< 0.1%
82.11
< 0.1%
81.251
< 0.1%
77.51
< 0.1%
76.071
< 0.1%
74.81
< 0.1%
74.11
< 0.1%

pulse
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct58
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean71.28602554
Minimum0
Maximum224
Zeros728
Zeros (%)2.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile54
Q164
median72
Q380
95-th percentile94
Maximum224
Range224
Interquartile range (IQR)16

Descriptive statistics

Standard deviation15.74826499
Coefficient of variation (CV)0.2209165803
Kurtosis7.355700398
Mean71.28602554
Median Absolute Deviation (MAD)8
Skewness-1.38234814
Sum2584760
Variance248.0078501
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
702527
 
7.0%
722392
 
6.6%
742361
 
6.5%
682360
 
6.5%
662290
 
6.3%
642160
 
6.0%
762059
 
5.7%
781921
 
5.3%
621891
 
5.2%
601822
 
5.0%
Other values (48)14476
39.9%
ValueCountFrequency (%)
0728
2.0%
341
 
< 0.1%
362
 
< 0.1%
4012
 
< 0.1%
4217
 
< 0.1%
4444
 
0.1%
4679
 
0.2%
48144
 
0.4%
50277
 
0.8%
52424
1.2%
ValueCountFrequency (%)
2241
 
< 0.1%
2201
 
< 0.1%
1721
 
< 0.1%
1661
 
< 0.1%
1601
 
< 0.1%
1421
 
< 0.1%
1403
< 0.1%
1365
< 0.1%
1341
 
< 0.1%
1322
 
< 0.1%

irregular_pulse
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35262 
1.0
 
997

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
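
As an illustration of how such character statistics can be obtained, here is a minimal sketch using Python's standard `unicodedata` module. It is not the profiling tool's own implementation; the helper name `char_profile` is hypothetical, and the input is rebuilt from the counts reported for this variable ("0.0" × 35262, "1.0" × 997). Script and block lookups are omitted because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

def char_profile(values):
    """Count characters and their Unicode general categories across all strings."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            # unicodedata.category returns a two-letter code,
            # e.g. "Nd" (Decimal Number) or "Po" (Other Punctuation).
            categories[unicodedata.category(ch)] += 1
    return chars, categories

# Reconstruct the column from the reported value counts.
values = ["0.0"] * 35262 + ["1.0"] * 997
chars, categories = char_profile(values)

print(sum(chars.values()), dict(chars), dict(categories))
```

For this input the totals match the figures reported in this section: 108777 characters, of which 72518 are Decimal Number and 36259 are Other Punctuation.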

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.035262
97.3%
1.0997
 
2.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.035262
97.3%
1.0997
 
2.7%

Most occurring characters

ValueCountFrequency (%)
071521
65.8%
.36259
33.3%
1997
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071521
98.6%
1997
 
1.4%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071521
65.8%
.36259
33.3%
1997
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071521
65.8%
.36259
33.3%
1997
 
0.9%

systolic
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct86
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean115.4648501
Minimum0
Maximum270
Zeros2530
Zeros (%)7.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1108
median120
Q3132
95-th percentile158
Maximum270
Range270
Interquartile range (IQR)24

Descriptive statistics

Standard deviation36.4544135
Coefficient of variation (CV)0.3157187098
Kurtosis4.572251902
Mean115.4648501
Median Absolute Deviation (MAD)12
Skewness-1.892400969
Sum4186640
Variance1328.924264
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
02530
 
7.0%
1161934
 
5.3%
1141886
 
5.2%
1241669
 
4.6%
1181657
 
4.6%
1261595
 
4.4%
1121504
 
4.1%
1201500
 
4.1%
1221486
 
4.1%
1101422
 
3.9%
Other values (76)19076
52.6%
ValueCountFrequency (%)
02530
7.0%
661
 
< 0.1%
722
 
< 0.1%
743
 
< 0.1%
782
 
< 0.1%
806
 
< 0.1%
8215
 
< 0.1%
8424
 
0.1%
8639
 
0.1%
8858
 
0.2%
ValueCountFrequency (%)
2701
 
< 0.1%
2561
 
< 0.1%
2381
 
< 0.1%
2361
 
< 0.1%
2321
 
< 0.1%
2303
< 0.1%
2283
< 0.1%
2261
 
< 0.1%
2242
< 0.1%
2222
< 0.1%

diastolic
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct61
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean64.98199068
Minimum0
Maximum134
Zeros2724
Zeros (%)7.5%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q160
median70
Q378
95-th percentile90
Maximum134
Range134
Interquartile range (IQR)18

Descriptive statistics

Standard deviation21.90688756
Coefficient of variation (CV)0.3371224447
Kurtosis3.24796726
Mean64.98199068
Median Absolute Deviation (MAD)8
Skewness-1.739634619
Sum2356182
Variance479.9117227
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
02724
 
7.5%
722412
 
6.7%
742403
 
6.6%
702171
 
6.0%
682170
 
6.0%
662115
 
5.8%
762114
 
5.8%
642051
 
5.7%
621817
 
5.0%
781774
 
4.9%
Other values (51)14508
40.0%
ValueCountFrequency (%)
02724
7.5%
42
 
< 0.1%
61
 
< 0.1%
102
 
< 0.1%
122
 
< 0.1%
142
 
< 0.1%
184
 
< 0.1%
206
 
< 0.1%
2210
 
< 0.1%
244
 
< 0.1%
ValueCountFrequency (%)
1341
 
< 0.1%
1243
 
< 0.1%
1222
 
< 0.1%
1209
 
< 0.1%
1186
 
< 0.1%
1168
 
< 0.1%
11415
< 0.1%
11213
 
< 0.1%
11026
0.1%
10837
0.1%

albumin
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct38
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.998571389
Minimum0
Maximum5.6
Zeros1915
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q14
median4.2
Q34.4
95-th percentile4.8
Maximum5.6
Range5.6
Interquartile range (IQR)0.4

Descriptive statistics

Standard deviation1.007912406
Coefficient of variation (CV)0.2520681283
Kurtosis10.17216305
Mean3.998571389
Median Absolute Deviation (MAD)0.2
Skewness-3.229492041
Sum144984.2
Variance1.015887418
MonotonicityNot monotonic
Histogram with fixed size bins (bins=38)
ValueCountFrequency (%)
4.24103
11.3%
4.34040
11.1%
4.43697
10.2%
4.13644
10.0%
4.53082
8.5%
43081
8.5%
3.92295
 
6.3%
4.62271
 
6.3%
01915
 
5.3%
3.81637
 
4.5%
Other values (28)6494
17.9%
ValueCountFrequency (%)
01915
5.3%
1.21
 
< 0.1%
21
 
< 0.1%
2.12
 
< 0.1%
2.32
 
< 0.1%
2.44
 
< 0.1%
2.56
 
< 0.1%
2.619
 
0.1%
2.726
 
0.1%
2.837
 
0.1%
ValueCountFrequency (%)
5.61
 
< 0.1%
5.53
 
< 0.1%
5.49
 
< 0.1%
5.325
 
0.1%
5.262
 
0.2%
5.1128
 
0.4%
5284
 
0.8%
4.9505
 
1.4%
4.8970
2.7%
4.71574
4.3%

ALT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct214
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean23.50555724
Minimum0
Maximum1363
Zeros1999
Zeros (%)5.5%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q115
median20
Q327
95-th percentile51
Maximum1363
Range1363
Interquartile range (IQR)12

Descriptive statistics

Standard deviation20.12851563
Coefficient of variation (CV)0.8563300763
Kurtosis655.6707779
Mean23.50555724
Median Absolute Deviation (MAD)6
Skewness14.01527795
Sum852288
Variance405.1571413
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01999
 
5.5%
171926
 
5.3%
161923
 
5.3%
151886
 
5.2%
181863
 
5.1%
191837
 
5.1%
141762
 
4.9%
201641
 
4.5%
211578
 
4.4%
131500
 
4.1%
Other values (204)18344
50.6%
ValueCountFrequency (%)
01999
5.5%
33
 
< 0.1%
42
 
< 0.1%
512
 
< 0.1%
631
 
0.1%
792
 
0.3%
8168
 
0.5%
9308
 
0.8%
10518
 
1.4%
11818
2.3%
ValueCountFrequency (%)
13631
< 0.1%
8191
< 0.1%
5361
< 0.1%
4431
< 0.1%
4201
< 0.1%
3931
< 0.1%
3431
< 0.1%
3291
< 0.1%
3191
< 0.1%
3171
< 0.1%

AST
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct193
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean23.904603
Minimum0
Maximum882
Zeros2016
Zeros (%)5.6%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q118
median22
Q327
95-th percentile41
Maximum882
Range882
Interquartile range (IQR)9

Descriptive statistics

Standard deviation16.56345311
Coefficient of variation (CV)0.692898063
Kurtosis536.4162419
Mean23.904603
Median Absolute Deviation (MAD)4
Skewness14.58863626
Sum866757
Variance274.347979
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
212489
 
6.9%
202462
 
6.8%
222391
 
6.6%
192316
 
6.4%
232250
 
6.2%
182087
 
5.8%
02016
 
5.6%
241955
 
5.4%
251793
 
4.9%
171772
 
4.9%
Other values (183)14728
40.6%
ValueCountFrequency (%)
02016
5.6%
73
 
< 0.1%
86
 
< 0.1%
911
 
< 0.1%
1043
 
0.1%
1182
 
0.2%
12174
 
0.5%
13346
 
1.0%
14576
 
1.6%
15932
2.6%
ValueCountFrequency (%)
8821
< 0.1%
8321
< 0.1%
7331
< 0.1%
5971
< 0.1%
4001
< 0.1%
3381
< 0.1%
3291
< 0.1%
3101
< 0.1%
3021
< 0.1%
2961
< 0.1%

ALP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct249
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean67.26316225
Minimum0
Maximum907
Zeros1923
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q153
median66
Q381
95-th percentile111
Maximum907
Range907
Interquartile range (IQR)28

Descriptive statistics

Standard deviation29.46757058
Coefficient of variation (CV)0.438093744
Kurtosis45.84184314
Mean67.26316225
Median Absolute Deviation (MAD)14
Skewness2.411201057
Sum2438895
Variance868.337716
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01923
 
5.3%
61754
 
2.1%
65746
 
2.1%
59719
 
2.0%
63711
 
2.0%
66704
 
1.9%
67703
 
1.9%
62702
 
1.9%
64699
 
1.9%
55696
 
1.9%
Other values (239)27902
77.0%
ValueCountFrequency (%)
01923
5.3%
71
 
< 0.1%
91
 
< 0.1%
144
 
< 0.1%
151
 
< 0.1%
163
 
< 0.1%
181
 
< 0.1%
203
 
< 0.1%
213
 
< 0.1%
2210
 
< 0.1%
ValueCountFrequency (%)
9071
< 0.1%
7291
< 0.1%
6461
< 0.1%
6381
< 0.1%
6331
< 0.1%
6261
< 0.1%
3841
< 0.1%
3771
< 0.1%
3521
< 0.1%
3491
< 0.1%

BUN
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct78
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12.79351333
Minimum0
Maximum98
Zeros1921
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q19
median12
Q316
95-th percentile23
Maximum98
Range98
Interquartile range (IQR)7

Descriptive statistics

Standard deviation6.548530698
Coefficient of variation (CV)0.5118633581
Kurtosis11.56809769
Mean12.79351333
Median Absolute Deviation (MAD)3
Skewness1.746549097
Sum463880
Variance42.8832543
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
113209
 
8.9%
123155
 
8.7%
133000
 
8.3%
102968
 
8.2%
92597
 
7.2%
142558
 
7.1%
152293
 
6.3%
81981
 
5.5%
161924
 
5.3%
01921
 
5.3%
Other values (68)10653
29.4%
ValueCountFrequency (%)
01921
5.3%
15
 
< 0.1%
230
 
0.1%
398
 
0.3%
4228
 
0.6%
5477
 
1.3%
6875
 
2.4%
71419
3.9%
81981
5.5%
92597
7.2%
ValueCountFrequency (%)
981
< 0.1%
961
< 0.1%
952
< 0.1%
901
< 0.1%
861
< 0.1%
811
< 0.1%
791
< 0.1%
741
< 0.1%
732
< 0.1%
722
< 0.1%

calcium
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct54
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean8.900228909
Minimum0
Maximum14.8
Zeros1950
Zeros (%)5.4%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q19.1
median9.4
Q39.6
95-th percentile10
Maximum14.8
Range14.8
Interquartile range (IQR)0.5

Descriptive statistics

Standard deviation2.152408788
Coefficient of variation (CV)0.241837464
Kurtosis12.74437859
Mean8.900228909
Median Absolute Deviation (MAD)0.2
Skewness-3.768865346
Sum322713.4
Variance4.632863591
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
9.44012
11.1%
9.33785
10.4%
9.53741
10.3%
9.23419
9.4%
9.63296
9.1%
9.12771
7.6%
9.72577
 
7.1%
91994
 
5.5%
01950
 
5.4%
9.81935
 
5.3%
Other values (44)6779
18.7%
ValueCountFrequency (%)
01950
5.4%
6.41
 
< 0.1%
6.51
 
< 0.1%
6.61
 
< 0.1%
6.92
 
< 0.1%
71
 
< 0.1%
7.21
 
< 0.1%
7.31
 
< 0.1%
7.51
 
< 0.1%
7.62
 
< 0.1%
ValueCountFrequency (%)
14.81
 
< 0.1%
12.71
 
< 0.1%
12.12
 
< 0.1%
122
 
< 0.1%
11.71
 
< 0.1%
11.54
 
< 0.1%
11.43
 
< 0.1%
11.33
 
< 0.1%
11.24
 
< 0.1%
11.110
< 0.1%

CO2
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct29
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean23.6839957
Minimum0
Maximum43
Zeros1992
Zeros (%)5.5%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q123
median25
Q326
95-th percentile29
Maximum43
Range43
Interquartile range (IQR)3

Descriptive statistics

Standard deviation6.140146267
Coefficient of variation (CV)0.2592529717
Kurtosis9.2755981
Mean23.6839957
Median Absolute Deviation (MAD)2
Skewness-3.054631438
Sum858758
Variance37.70139618
MonotonicityNot monotonic
Histogram with fixed size bins (bins=29)
ValueCountFrequency (%)
256108
16.8%
265718
15.8%
245269
14.5%
274354
12.0%
233749
10.3%
282622
7.2%
222285
 
6.3%
01992
 
5.5%
291215
 
3.4%
211200
 
3.3%
Other values (19)1747
 
4.8%
ValueCountFrequency (%)
01992
5.5%
101
 
< 0.1%
131
 
< 0.1%
141
 
< 0.1%
154
 
< 0.1%
1614
 
< 0.1%
1723
 
0.1%
1899
 
0.3%
19251
 
0.7%
20538
 
1.5%
ValueCountFrequency (%)
431
 
< 0.1%
401
 
< 0.1%
384
 
< 0.1%
373
 
< 0.1%
3510
 
< 0.1%
3416
 
< 0.1%
3323
 
0.1%
3266
 
0.2%
31164
 
0.5%
30527
1.5%

creatinine
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct369
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8549488403
Minimum0
Maximum17.8
Zeros1917
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10.7
median0.83
Q31
95-th percentile1.31
Maximum17.8
Range17.8
Interquartile range (IQR)0.3

Descriptive statistics

Standard deviation0.4745326014
Coefficient of variation (CV)0.555042102
Kurtosis239.3340584
Mean0.8549488403
Median Absolute Deviation (MAD)0.15
Skewness10.27375417
Sum30999.59
Variance0.2251811898
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01917
 
5.3%
0.81360
 
3.8%
0.91326
 
3.7%
0.71061
 
2.9%
11052
 
2.9%
0.82984
 
2.7%
0.72920
 
2.5%
0.92844
 
2.3%
1.1706
 
1.9%
0.6650
 
1.8%
Other values (359)25439
70.2%
ValueCountFrequency (%)
01917
5.3%
0.161
 
< 0.1%
0.251
 
< 0.1%
0.291
 
< 0.1%
0.33
 
< 0.1%
0.312
 
< 0.1%
0.326
 
< 0.1%
0.335
 
< 0.1%
0.344
 
< 0.1%
0.353
 
< 0.1%
ValueCountFrequency (%)
17.81
< 0.1%
17.411
< 0.1%
16.641
< 0.1%
15.661
< 0.1%
12.741
< 0.1%
12.481
< 0.1%
10.981
< 0.1%
10.351
< 0.1%
10.221
< 0.1%
9.661
< 0.1%

GGT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct382
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean27.50015169
Minimum0
Maximum1681
Zeros1923
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q113
median19
Q329
95-th percentile70
Maximum1681
Range1681
Interquartile range (IQR)16

Descriptive statistics

Standard deviation41.31718983
Coefficient of variation (CV)1.502434979
Kurtosis288.9590301
Mean27.50015169
Median Absolute Deviation (MAD)7
Skewness12.49961296
Sum997128
Variance1707.110176
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01923
 
5.3%
151663
 
4.6%
141643
 
4.5%
131604
 
4.4%
161526
 
4.2%
121524
 
4.2%
171518
 
4.2%
181446
 
4.0%
111351
 
3.7%
191322
 
3.6%
Other values (372)20739
57.2%
ValueCountFrequency (%)
01923
5.3%
21
 
< 0.1%
31
 
< 0.1%
417
 
< 0.1%
5128
 
0.4%
6210
 
0.6%
7367
 
1.0%
8617
 
1.7%
9878
2.4%
101133
3.1%
ValueCountFrequency (%)
16811
< 0.1%
14791
< 0.1%
13631
< 0.1%
11971
< 0.1%
11921
< 0.1%
11361
< 0.1%
11161
< 0.1%
10611
< 0.1%
10121
< 0.1%
9081
< 0.1%

glucose
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct403
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean96.82412642
Minimum0
Maximum777
Zeros1917
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q184
median92
Q3103
95-th percentile160
Maximum777
Range777
Interquartile range (IQR)19

Descriptive statistics

Standard deviation44.27838356
Coefficient of variation (CV)0.4573073386
Kurtosis19.40296772
Mean96.82412642
Median Absolute Deviation (MAD)9
Skewness2.713582949
Sum3510746
Variance1960.575251
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01917
 
5.3%
881186
 
3.3%
901177
 
3.2%
891172
 
3.2%
911162
 
3.2%
871152
 
3.2%
921128
 
3.1%
861076
 
3.0%
931068
 
2.9%
851033
 
2.8%
Other values (393)24188
66.7%
ValueCountFrequency (%)
01917
5.3%
191
 
< 0.1%
331
 
< 0.1%
341
 
< 0.1%
351
 
< 0.1%
381
 
< 0.1%
401
 
< 0.1%
412
 
< 0.1%
421
 
< 0.1%
433
 
< 0.1%
ValueCountFrequency (%)
7771
< 0.1%
6261
< 0.1%
6171
< 0.1%
6101
< 0.1%
6051
< 0.1%
5771
< 0.1%
5681
< 0.1%
5671
< 0.1%
5591
< 0.1%
5541
< 0.1%

iron
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct285
Distinct (%)0.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean80.29352712
Minimum0
Maximum476
Zeros1955
Zeros (%)5.4%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q157
median78
Q3103
95-th percentile147
Maximum476
Range476
Interquartile range (IQR)46

Descriptive statistics

Standard deviation39.7171105
Coefficient of variation (CV)0.4946489701
Kurtosis1.946349957
Mean80.29352712
Median Absolute Deviation (MAD)23
Skewness0.5187356693
Sum2911363
Variance1577.448866
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01955
 
5.4%
68467
 
1.3%
79455
 
1.3%
63452
 
1.2%
76447
 
1.2%
71439
 
1.2%
67439
 
1.2%
87433
 
1.2%
84432
 
1.2%
81432
 
1.2%
Other values (275)30308
83.6%
ValueCountFrequency (%)
01955
5.4%
21
 
< 0.1%
51
 
< 0.1%
62
 
< 0.1%
78
 
< 0.1%
85
 
< 0.1%
97
 
< 0.1%
1021
 
0.1%
1119
 
0.1%
1215
 
< 0.1%
ValueCountFrequency (%)
4761
< 0.1%
4281
< 0.1%
3871
< 0.1%
3821
< 0.1%
3431
< 0.1%
3251
< 0.1%
3152
< 0.1%
3021
< 0.1%
3001
< 0.1%
2991
< 0.1%

LHD
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct307
Distinct (%)0.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean125.8677846
Minimum0
Maximum1539
Zeros2106
Zeros (%)5.8%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1110
median127
Q3147
95-th percentile184
Maximum1539
Range1539
Interquartile range (IQR)37

Descriptive statistics

Standard deviation44.8916452
Coefficient of variation (CV)0.3566571491
Kurtosis52.83403444
Mean125.8677846
Median Absolute Deviation (MAD)18
Skewness1.383721903
Sum4563840
Variance2015.259809
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
02106
 
5.8%
121609
 
1.7%
120594
 
1.6%
122570
 
1.6%
123569
 
1.6%
131565
 
1.6%
125546
 
1.5%
119527
 
1.5%
118526
 
1.5%
134523
 
1.4%
Other values (297)29124
80.3%
ValueCountFrequency (%)
02106
5.8%
41
 
< 0.1%
321
 
< 0.1%
371
 
< 0.1%
381
 
< 0.1%
391
 
< 0.1%
462
 
< 0.1%
503
 
< 0.1%
513
 
< 0.1%
525
 
< 0.1%
ValueCountFrequency (%)
15391
< 0.1%
12741
< 0.1%
10921
< 0.1%
7791
< 0.1%
7591
< 0.1%
7261
< 0.1%
6731
< 0.1%
5811
< 0.1%
5771
< 0.1%
5521
< 0.1%

phosphorus
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct65
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.54431617
Minimum0
Maximum10.9
Zeros1923
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q13.3
median3.7
Q34.1
95-th percentile4.7
Maximum10.9
Range10.9
Interquartile range (IQR)0.8

Descriptive statistics

Standard deviation1.006365554
Coefficient of variation (CV)0.2839378617
Kurtosis5.797482153
Mean3.54431617
Median Absolute Deviation (MAD)0.4
Skewness-2.059543844
Sum128513.36
Variance1.012771629
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3.62534
 
7.0%
3.82501
 
6.9%
3.72486
 
6.9%
3.52358
 
6.5%
3.92313
 
6.4%
3.42153
 
5.9%
42149
 
5.9%
01923
 
5.3%
4.11886
 
5.2%
3.31876
 
5.2%
Other values (55)14080
38.8%
ValueCountFrequency (%)
01923
5.3%
11
 
< 0.1%
1.51
 
< 0.1%
1.61
 
< 0.1%
1.72
 
< 0.1%
1.89
 
< 0.1%
1.99
 
< 0.1%
217
 
< 0.1%
2.124
 
0.1%
2.246
 
0.1%
ValueCountFrequency (%)
10.91
 
< 0.1%
9.61
 
< 0.1%
8.91
 
< 0.1%
8.21
 
< 0.1%
8.12
< 0.1%
7.61
 
< 0.1%
7.52
< 0.1%
7.22
< 0.1%
6.91
 
< 0.1%
6.73
< 0.1%

bilirubin
Real number (ℝ≥0)

ZEROS

Distinct56
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.6249670427
Minimum0
Maximum13.1
Zeros1939
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10.4
median0.6
Q30.8
95-th percentile1.2
Maximum13.1
Range13.1
Interquartile range (IQR)0.4

Descriptive statistics

Standard deviation0.346267444
Coefficient of variation (CV)0.5540571269
Kurtosis57.87652832
Mean0.6249670427
Median Absolute Deviation (MAD)0.2
Skewness2.71240802
Sum22660.68
Variance0.1199011428
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.65511
15.2%
0.55235
14.4%
0.74749
13.1%
0.43974
11.0%
0.83673
10.1%
0.32525
7.0%
0.92340
6.5%
01939
 
5.3%
11575
 
4.3%
0.21342
 
3.7%
Other values (46)3396
9.4%
ValueCountFrequency (%)
01939
5.3%
5.397605347 × 10^-79 (count: 19)
 
0.1%
0.014
 
< 0.1%
0.024
 
< 0.1%
0.031
 
< 0.1%
0.042
 
< 0.1%
0.054
 
< 0.1%
0.062
 
< 0.1%
0.073
 
< 0.1%
0.084
 
< 0.1%
ValueCountFrequency (%)
13.11
< 0.1%
7.31
< 0.1%
7.11
< 0.1%
4.41
< 0.1%
4.11
< 0.1%
3.91
< 0.1%
3.62
< 0.1%
3.32
< 0.1%
3.22
< 0.1%
3.12
< 0.1%

total_protein
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct59
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.762011638
Minimum0
Maximum11.3
Zeros1966
Zeros (%)5.4%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q16.8
median7.1
Q37.4
95-th percentile7.9
Maximum11.3
Range11.3
Interquartile range (IQR)0.6

Descriptive statistics

Standard deviation1.68397923
Coefficient of variation (CV)0.2490352457
Kurtosis11.15567277
Mean6.762011638
Median Absolute Deviation (MAD)0.3
Skewness-3.443630819
Sum245183.78
Variance2.835786046
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
7.23067
 
8.5%
7.12983
 
8.2%
72961
 
8.2%
7.32755
 
7.6%
6.92654
 
7.3%
7.42511
 
6.9%
6.82259
 
6.2%
7.52177
 
6.0%
01966
 
5.4%
6.71826
 
5.0%
Other values (49)11100
30.6%
ValueCountFrequency (%)
01966
5.4%
3.41
 
< 0.1%
4.72
 
< 0.1%
4.91
 
< 0.1%
5.11
 
< 0.1%
5.22
 
< 0.1%
5.311
 
< 0.1%
5.381
 
< 0.1%
5.46
 
< 0.1%
5.522
 
0.1%
ValueCountFrequency (%)
11.31
 
< 0.1%
10.91
 
< 0.1%
10.81
 
< 0.1%
10.41
 
< 0.1%
10.31
 
< 0.1%
10.13
< 0.1%
101
 
< 0.1%
9.91
 
< 0.1%
9.71
 
< 0.1%
9.52
< 0.1%

uric_acid
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct124
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.152086379
Minimum0
Maximum18
Zeros1927
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q14.2
median5.2
Q36.3
95-th percentile7.9
Maximum18
Range18
Interquartile range (IQR)2.1

Descriptive statistics

Standard deviation1.862481096
Coefficient of variation (CV)0.361500363
Kurtosis1.665349375
Mean5.152086379
Median Absolute Deviation (MAD)1
Skewness-0.6094062417
Sum186809.5
Variance3.468835834
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01927
 
5.3%
5.3985
 
2.7%
5.2982
 
2.7%
4.9954
 
2.6%
4.6951
 
2.6%
5.4951
 
2.6%
5.5943
 
2.6%
4.8911
 
2.5%
5910
 
2.5%
5.6907
 
2.5%
Other values (114)25838
71.3%
ValueCountFrequency (%)
01927
5.3%
0.41
 
< 0.1%
0.51
 
< 0.1%
0.71
 
< 0.1%
0.82
 
< 0.1%
1.11
 
< 0.1%
1.21
 
< 0.1%
1.41
 
< 0.1%
1.51
 
< 0.1%
1.64
 
< 0.1%
ValueCountFrequency (%)
181
< 0.1%
17.61
< 0.1%
15.11
< 0.1%
13.71
< 0.1%
13.32
< 0.1%
13.11
< 0.1%
131
< 0.1%
12.42
< 0.1%
12.32
< 0.1%
12.22
< 0.1%

sodium
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct41
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean131.9389393
Minimum0
Maximum161
Zeros1918
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1138
median139
Q3141
95-th percentile143
Maximum161
Range161
Interquartile range (IQR)3

Descriptive statistics

Standard deviation31.26932909
Coefficient of variation (CV)0.2369984878
Kurtosis13.7746747
Mean131.9389393
Median Absolute Deviation (MAD)2
Skewness-3.957921588
Sum4783974
Variance977.7709419
MonotonicityNot monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
1396360
17.5%
1406191
17.1%
1385065
14.0%
1414727
13.0%
1373190
8.8%
1422823
7.8%
01918
 
5.3%
1361734
 
4.8%
1431420
 
3.9%
135855
 
2.4%
Other values (31)1976
 
5.4%
ValueCountFrequency (%)
01918
5.3%
991
 
< 0.1%
1071
 
< 0.1%
1141
 
< 0.1%
1171
 
< 0.1%
1192
 
< 0.1%
1201
 
< 0.1%
1212
 
< 0.1%
1232
 
< 0.1%
1246
 
< 0.1%
ValueCountFrequency (%)
1611
 
< 0.1%
1602
 
< 0.1%
1571
 
< 0.1%
1541
 
< 0.1%
1531
 
< 0.1%
1514
 
< 0.1%
1502
 
< 0.1%
1495
 
< 0.1%
14819
0.1%
14746
0.1%

potassium
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct243
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.774180479
Minimum0
Maximum7.3
Zeros1924
Zeros (%)5.3%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum0
5-th percentile0
Q13.7
median3.92
Q34.2
95-th percentile4.6
Maximum7.3
Range7.3
Interquartile range (IQR)0.5

Descriptive statistics

Standard deviation0.9553653807
Coefficient of variation (CV)0.2531318748
Kurtosis10.06145959
Mean3.774180479
Median Absolute Deviation (MAD)0.22
Skewness-3.155975666
Sum136848.01
Variance0.9127230107
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
4                   3766    10.4%
3.9                 3738    10.3%
3.8                 3397    9.4%
4.1                 3360    9.3%
4.2                 2732    7.5%
3.7                 2641    7.3%
4.3                 2009    5.5%
0                   1924    5.3%
3.6                 1850    5.1%
4.4                 1395    3.8%
Other values (233)  9447    26.1%
ValueCountFrequency (%)
01924
5.3%
2.31
 
< 0.1%
2.41
 
< 0.1%
2.51
 
< 0.1%
2.62
 
< 0.1%
2.631
 
< 0.1%
2.661
 
< 0.1%
2.71
 
< 0.1%
2.815
 
< 0.1%
2.821
 
< 0.1%
ValueCountFrequency (%)
7.31
 
< 0.1%
6.61
 
< 0.1%
63
 
< 0.1%
5.94
 
< 0.1%
5.862
 
< 0.1%
5.83
 
< 0.1%
5.74
 
< 0.1%
5.610
< 0.1%
5.521
 
< 0.1%
5.511
 
< 0.1%

chloride
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 41
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97.97426846
Minimum: 0
Maximum: 120
Zeros: 1918
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 101
median: 103
Q3: 105
95-th percentile: 108
Maximum: 120
Range: 120
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 23.3412209
Coefficient of variation (CV): 0.2382382769
Kurtosis: 13.43178487
Mean: 97.97426846
Median Absolute Deviation (MAD): 2
Skewness: -3.890010881
Sum: 3552449
Variance: 544.8125929
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value              Count   Frequency (%)
104                5061    14.0%
105                4558    12.6%
103                4534    12.5%
102                3780    10.4%
106                3559    9.8%
101                2908    8.0%
107                2398    6.6%
100                1935    5.3%
0                  1918    5.3%
108                1360    3.8%
Other values (31)  4248    11.7%
ValueCountFrequency (%)
01918
5.3%
731
 
< 0.1%
791
 
< 0.1%
821
 
< 0.1%
831
 
< 0.1%
842
 
< 0.1%
862
 
< 0.1%
874
 
< 0.1%
886
 
< 0.1%
896
 
< 0.1%
ValueCountFrequency (%)
1201
 
< 0.1%
1192
 
< 0.1%
1182
 
< 0.1%
1173
 
< 0.1%
1162
 
< 0.1%
1155
 
< 0.1%
1148
 
< 0.1%
11319
 
0.1%
11234
 
0.1%
111109
0.3%

osmolality
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 78
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 263.7606387
Minimum: 0
Maximum: 323
Zeros: 1923
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 275
median: 278
Q3: 281
95-th percentile: 287
Maximum: 323
Range: 323
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 62.64001176
Coefficient of variation (CV): 0.2374880955
Kurtosis: 13.68097084
Mean: 263.7606387
Median Absolute Deviation (MAD): 3
Skewness: -3.942406539
Sum: 9563697
Variance: 3923.771074
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
278                2930    8.1%
277                2873    7.9%
279                2770    7.6%
280                2642    7.3%
276                2619    7.2%
281                2297    6.3%
275                2247    6.2%
282                2065    5.7%
0                  1923    5.3%
274                1875    5.2%
Other values (68)  12018   33.1%
ValueCountFrequency (%)
01923
5.3%
2011
 
< 0.1%
2151
 
< 0.1%
2281
 
< 0.1%
2351
 
< 0.1%
2371
 
< 0.1%
2411
 
< 0.1%
2431
 
< 0.1%
2441
 
< 0.1%
2463
 
< 0.1%
ValueCountFrequency (%)
3231
< 0.1%
3221
< 0.1%
3211
< 0.1%
3152
< 0.1%
3141
< 0.1%
3131
< 0.1%
3101
< 0.1%
3091
< 0.1%
3082
< 0.1%
3072
< 0.1%

globulin
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 62
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.769648915
Minimum: 0
Maximum: 7.5
Zeros: 1967
Zeros (%): 5.4%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2.6
median: 2.9
Q3: 3.2
95-th percentile: 3.7
Maximum: 7.5
Range: 7.5
Interquartile range (IQR): 0.6

Descriptive statistics

Standard deviation: 0.7989862818
Coefficient of variation (CV): 0.2884792645
Kurtosis: 5.644149082
Mean: 2.769648915
Median Absolute Deviation (MAD): 0.3
Skewness: -1.921444607
Sum: 100424.7
Variance: 0.6383790786
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
2.8                3340    9.2%
2.9                3146    8.7%
2.7                3123    8.6%
3                  3037    8.4%
2.6                2703    7.5%
3.1                2662    7.3%
2.5                2248    6.2%
3.2                2213    6.1%
0                  1967    5.4%
3.3                1851    5.1%
Other values (52)  9969    27.5%
ValueCountFrequency (%)
01967
5.4%
0.71
 
< 0.1%
0.81
 
< 0.1%
14
 
< 0.1%
1.21
 
< 0.1%
1.33
 
< 0.1%
1.42
 
< 0.1%
1.55
 
< 0.1%
1.610
 
< 0.1%
1.718
 
< 0.1%
ValueCountFrequency (%)
7.51
 
< 0.1%
7.21
 
< 0.1%
7.11
 
< 0.1%
71
 
< 0.1%
6.71
 
< 0.1%
6.61
 
< 0.1%
6.51
 
< 0.1%
6.31
 
< 0.1%
6.23
< 0.1%
64
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
27243 
1.0
9016 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
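
As an illustration of the character properties mentioned above, the following sketch tallies characters and their Unicode general categories for this variable, starting from its two value counts (`0.0`: 27243, `1.0`: 9016). It uses Python's standard `unicodedata` module and is not necessarily how the report itself was computed; the names `value_counts`, `char_counts`, and `category_counts` are illustrative.

```python
import unicodedata
from collections import Counter

# Value counts as listed for this variable in the report.
value_counts = {"0.0": 27243, "1.0": 9016}

char_counts = Counter()      # occurrences of each character
category_counts = Counter()  # occurrences per Unicode general category

for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n
        # unicodedata.category returns the two-letter general category,
        # e.g. "Nd" (Decimal Number) or "Po" (Other Punctuation).
        category_counts[unicodedata.category(ch)] += n

print(char_counts)
print(category_counts)
print("Total characters:", sum(char_counts.values()))
```

The resulting tallies can be compared with the "Most occurring characters" and "Most occurring categories" tables for the same variable.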

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    27243   75.1%
1.0    9016    24.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.027243
75.1%
1.09016
 
24.9%

Most occurring characters

ValueCountFrequency (%)
063502
58.4%
.36259
33.3%
19016
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
063502
87.6%
19016
 
12.4%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
063502
58.4%
.36259
33.3%
19016
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
063502
58.4%
.36259
33.3%
19016
 
8.3%

sleep_hours
Real number (ℝ≥0)

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.071830442
Minimum: 0
Maximum: 14.5
Zeros: 97
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5
Q1: 6
median: 7
Q3: 8
95-th percentile: 9.5
Maximum: 14.5
Range: 14.5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.57052537
Coefficient of variation (CV): 0.2220818758
Kurtosis: 1.658298951
Mean: 7.071830442
Median Absolute Deviation (MAD): 1
Skewness: -0.1881441416
Sum: 256417.5
Variance: 2.466549937
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)
Value              Count   Frequency (%)
8                  8941    24.7%
7                  8553    23.6%
6                  6868    18.9%
5                  2855    7.9%
9                  2503    6.9%
4                  1223    3.4%
7.5                1026    2.8%
10                 1005    2.8%
8.5                763     2.1%
6.5                645     1.8%
Other values (18)  1877    5.2%
ValueCountFrequency (%)
097
 
0.3%
115
 
< 0.1%
2101
 
0.3%
2.51
 
< 0.1%
3319
 
0.9%
3.531
 
0.1%
41223
3.4%
4.592
 
0.3%
52855
7.9%
5.5233
 
0.6%
ValueCountFrequency (%)
14.52
 
< 0.1%
1412
 
< 0.1%
13.52
 
< 0.1%
1324
 
0.1%
12.56
 
< 0.1%
12228
 
0.6%
11.534
 
0.1%
11282
 
0.8%
10.592
 
0.3%
101005
2.8%

vigorous_recreation
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
No
27318 
Yes
8938 
Missing
 
3

Length

Max length7
Median length2
Mean length2.246918007
Min length2

Characters and Unicode

Total characters81471
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
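
The length and character totals reported above for vigorous_recreation follow directly from its three value counts (`No`: 27318, `Yes`: 8938, `Missing`: 3). The sketch below recomputes them; it assumes the total is simply the sum of string lengths weighted by count, and the variable names are illustrative.

```python
# Value counts as listed for vigorous_recreation in the report.
value_counts = {"No": 27318, "Yes": 8938, "Missing": 3}

n_rows = sum(value_counts.values())

# Total characters: each label's length times the number of rows holding it.
total_characters = sum(len(label) * count for label, count in value_counts.items())

mean_length = total_characters / n_rows
min_length = min(len(label) for label in value_counts)
max_length = max(len(label) for label in value_counts)

print("Rows:", n_rows)
print("Total characters:", total_characters)
print(f"Mean length: {mean_length:.9f}")
print("Min length:", min_length, "Max length:", max_length)
```

The mean length sits well below 3 because the two-character label `No` covers roughly three quarters of the rows.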

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowYes
5th rowYes

Common Values

Value    Count   Frequency (%)
No       27318   75.3%
Yes      8938    24.7%
Missing  3       < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
no27318
75.3%
yes8938
 
24.7%
missing3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N27318
33.5%
o27318
33.5%
s8944
 
11.0%
Y8938
 
11.0%
e8938
 
11.0%
i6
 
< 0.1%
M3
 
< 0.1%
n3
 
< 0.1%
g3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter45212
55.5%
Uppercase Letter36259
44.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o27318
60.4%
s8944
 
19.8%
e8938
 
19.8%
i6
 
< 0.1%
n3
 
< 0.1%
g3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N27318
75.3%
Y8938
 
24.7%
M3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin81471
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N27318
33.5%
o27318
33.5%
s8944
 
11.0%
Y8938
 
11.0%
e8938
 
11.0%
i6
 
< 0.1%
M3
 
< 0.1%
n3
 
< 0.1%
g3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII81471
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N27318
33.5%
o27318
33.5%
s8944
 
11.0%
Y8938
 
11.0%
e8938
 
11.0%
i6
 
< 0.1%
M3
 
< 0.1%
n3
 
< 0.1%
g3
 
< 0.1%

moderate_recreation
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
No
20917 
Yes
15336 
Missing
 
6

Length

Max length7
Median length2
Mean length2.42378444
Min length2

Characters and Unicode

Total characters87884
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo
2nd rowYes
3rd rowYes
4th rowYes
5th rowNo

Common Values

Value    Count   Frequency (%)
No       20917   57.7%
Yes      15336   42.3%
Missing  6       < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
no20917
57.7%
yes15336
42.3%
missing6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N20917
23.8%
o20917
23.8%
s15348
17.5%
Y15336
17.5%
e15336
17.5%
i12
 
< 0.1%
M6
 
< 0.1%
n6
 
< 0.1%
g6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter51625
58.7%
Uppercase Letter36259
41.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o20917
40.5%
s15348
29.7%
e15336
29.7%
i12
 
< 0.1%
n6
 
< 0.1%
g6
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N20917
57.7%
Y15336
42.3%
M6
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin87884
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N20917
23.8%
o20917
23.8%
s15348
17.5%
Y15336
17.5%
e15336
17.5%
i12
 
< 0.1%
M6
 
< 0.1%
n6
 
< 0.1%
g6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII87884
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N20917
23.8%
o20917
23.8%
s15348
17.5%
Y15336
17.5%
e15336
17.5%
i12
 
< 0.1%
M6
 
< 0.1%
n6
 
< 0.1%
g6
 
< 0.1%

sedentary_time
Real number (ℝ≥0)

Distinct: 67
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 330.8408119
Minimum: 0
Maximum: 1320
Zeros: 224
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 60
Q1: 180
median: 300
Q3: 480
95-th percentile: 720
Maximum: 1320
Range: 1320
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 201.395358
Coefficient of variation (CV): 0.6087379511
Kurtosis: 0.261350137
Mean: 330.8408119
Median Absolute Deviation (MAD): 120
Skewness: 0.7583874284
Sum: 11995957
Variance: 40560.09022
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
240                4617    12.7%
180                4244    11.7%
480                4189    11.6%
300                3886    10.7%
120                3749    10.3%
360                3723    10.3%
600                2509    6.9%
60                 1700    4.7%
420                1492    4.1%
720                1458    4.0%
Other values (57)  4692    12.9%
ValueCountFrequency (%)
0224
0.6%
112
 
< 0.1%
210
 
< 0.1%
38
 
< 0.1%
44
 
< 0.1%
59
 
< 0.1%
65
 
< 0.1%
72
 
< 0.1%
87
 
< 0.1%
91
 
< 0.1%
ValueCountFrequency (%)
13202
 
< 0.1%
120013
 
< 0.1%
11404
 
< 0.1%
108054
 
0.1%
102026
 
0.1%
960154
 
0.4%
900206
 
0.6%
840324
 
0.9%
780186
 
0.5%
7201458
4.0%

vigorous_work
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
29647 
1.0
6612 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    29647   81.8%
1.0    6612    18.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.029647
81.8%
1.06612
 
18.2%

Most occurring characters

ValueCountFrequency (%)
065906
60.6%
.36259
33.3%
16612
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
065906
90.9%
16612
 
9.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
065906
60.6%
.36259
33.3%
16612
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
065906
60.6%
.36259
33.3%
16612
 
6.1%

moderate_work
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
23690 
1.0
12569 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row1.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    23690   65.3%
1.0    12569   34.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.023690
65.3%
1.012569
34.7%

Most occurring characters

ValueCountFrequency (%)
059949
55.1%
.36259
33.3%
112569
 
11.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
059949
82.7%
112569
 
17.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
059949
55.1%
.36259
33.3%
112569
 
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
059949
55.1%
.36259
33.3%
112569
 
11.6%

drinks_per_occasion
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 32
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.597975675
Minimum: 0
Maximum: 83
Zeros: 15602
Zeros (%): 43.0%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 6
Maximum: 83
Range: 83
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.489864121
Coefficient of variation (CV): 1.558136434
Kurtosis: 98.0071488
Mean: 1.597975675
Median Absolute Deviation (MAD): 1
Skewness: 5.520727857
Sum: 57941
Variance: 6.199423339
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)
Value              Count   Frequency (%)
0                  15602   43.0%
1                  7235    20.0%
2                  5696    15.7%
3                  2960    8.2%
4                  1560    4.3%
6                  1060    2.9%
5                  906     2.5%
12                 294     0.8%
8                  285     0.8%
10                 227     0.6%
Other values (22)  434     1.2%
ValueCountFrequency (%)
015602
43.0%
17235
20.0%
25696
 
15.7%
32960
 
8.2%
41560
 
4.3%
5906
 
2.5%
61060
 
2.9%
7197
 
0.5%
8285
 
0.8%
947
 
0.1%
ValueCountFrequency (%)
831
 
< 0.1%
821
 
< 0.1%
661
 
< 0.1%
641
 
< 0.1%
361
 
< 0.1%
321
 
< 0.1%
304
 
< 0.1%
252
 
< 0.1%
2413
< 0.1%
232
 
< 0.1%

lifetime_alcohol_consumption
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
Missing
23641 
Yes
8009 
No
4609 

Length

Max length7
Median length7
Mean length5.480901293
Min length2

Characters and Unicode

Total characters198732
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMissing
2nd rowMissing
3rd rowMissing
4th rowMissing
5th rowMissing

Common Values

Value    Count   Frequency (%)
Missing  23641   65.2%
Yes      8009    22.1%
No       4609    12.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
missing23641
65.2%
yes8009
 
22.1%
no4609
 
12.7%

Most occurring characters

ValueCountFrequency (%)
s55291
27.8%
i47282
23.8%
M23641
11.9%
n23641
11.9%
g23641
11.9%
Y8009
 
4.0%
e8009
 
4.0%
N4609
 
2.3%
o4609
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter162473
81.8%
Uppercase Letter36259
 
18.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s55291
34.0%
i47282
29.1%
n23641
14.6%
g23641
14.6%
e8009
 
4.9%
o4609
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
M23641
65.2%
Y8009
 
22.1%
N4609
 
12.7%

Most occurring scripts

ValueCountFrequency (%)
Latin198732
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s55291
27.8%
i47282
23.8%
M23641
11.9%
n23641
11.9%
g23641
11.9%
Y8009
 
4.0%
e8009
 
4.0%
N4609
 
2.3%
o4609
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII198732
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s55291
27.8%
i47282
23.8%
M23641
11.9%
n23641
11.9%
g23641
11.9%
Y8009
 
4.0%
e8009
 
4.0%
N4609
 
2.3%
o4609
 
2.3%

drinks_past_year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 94
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.30836482
Minimum: 0
Maximum: 365
Zeros: 15602
Zeros (%): 43.0%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 48
95-th percentile: 260
Maximum: 365
Range: 365
Interquartile range (IQR): 48

Descriptive statistics

Standard deviation: 83.99128606
Coefficient of variation (CV): 1.985217023
Kurtosis: 5.997617689
Mean: 42.30836482
Median Absolute Deviation (MAD): 2
Skewness: 2.553895358
Sum: 1534059
Variance: 7054.536135
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
0                  15602   43.0%
52                 2383    6.6%
12                 2281    6.3%
104                1984    5.5%
2                  1556    4.3%
24                 1537    4.2%
156                1115    3.1%
5                  1002    2.8%
1                  1000    2.8%
364                976     2.7%
Other values (84)  6823    18.8%
ValueCountFrequency (%)
015602
43.0%
11000
 
2.8%
21556
 
4.3%
3820
 
2.3%
4568
 
1.6%
51002
 
2.8%
6477
 
1.3%
7132
 
0.4%
8102
 
0.3%
9313
 
0.9%
ValueCountFrequency (%)
365151
 
0.4%
364976
2.7%
3609
 
< 0.1%
3504
 
< 0.1%
3481
 
< 0.1%
3401
 
< 0.1%
3362
 
< 0.1%
3341
 
< 0.1%
3302
 
< 0.1%
3241
 
< 0.1%

cant_work
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
31342 
1.0
4917 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    31342   86.4%
1.0    4917    13.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.031342
86.4%
1.04917
 
13.6%

Most occurring characters

ValueCountFrequency (%)
067601
62.1%
.36259
33.3%
14917
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
067601
93.2%
14917
 
6.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
067601
62.1%
.36259
33.3%
14917
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
067601
62.1%
.36259
33.3%
14917
 
4.5%

limited_work
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
28893 
1.0
7366 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    28893   79.7%
1.0    7366    20.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.028893
79.7%
1.07366
 
20.3%

Most occurring characters

ValueCountFrequency (%)
065152
59.9%
.36259
33.3%
17366
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
065152
89.8%
17366
 
10.2%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
065152
59.9%
.36259
33.3%
17366
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
065152
59.9%
.36259
33.3%
17366
 
6.8%

walking_equipment
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
32829 
1.0
3430 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    32829   90.5%
1.0    3430    9.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.032829
90.5%
1.03430
 
9.5%

Most occurring characters

ValueCountFrequency (%)
069088
63.5%
.36259
33.3%
13430
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069088
95.3%
13430
 
4.7%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069088
63.5%
.36259
33.3%
13430
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069088
63.5%
.36259
33.3%
13430
 
3.2%

memory_problems
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
33523 
1.0
 
2736

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    33523   92.5%
1.0    2736    7.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.033523
92.5%
1.02736
 
7.5%

Most occurring characters

ValueCountFrequency (%)
069782
64.2%
.36259
33.3%
12736
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069782
96.2%
12736
 
3.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069782
64.2%
.36259
33.3%
12736
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069782
64.2%
.36259
33.3%
12736
 
2.5%

limitations
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35540 
1.0
 
719

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    35540   98.0%
1.0    719     2.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.035540
98.0%
1.0719
 
2.0%

Most occurring characters

ValueCountFrequency (%)
071799
66.0%
.36259
33.3%
1719
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071799
99.0%
1719
 
1.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071799
66.0%
.36259
33.3%
1719
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071799
66.0%
.36259
33.3%
1719
 
0.7%

healthcare_equipment
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
33078 
1.0
 
3181

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    33078   91.2%
1.0    3181    8.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.033078
91.2%
1.03181
 
8.8%

Most occurring characters

ValueCountFrequency (%)
069337
63.7%
.36259
33.3%
13181
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069337
95.6%
13181
 
4.4%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069337
63.7%
.36259
33.3%
13181
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069337
63.7%
.36259
33.3%
13181
 
2.9%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
33573 
1.0
 
2686

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count   Frequency (%)
0.0    33573   92.6%
1.0    2686    7.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.033573
92.6%
1.02686
 
7.4%

Most occurring characters

ValueCountFrequency (%)
069832
64.2%
.36259
33.3%
12686
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
069832
96.3%
12686
 
3.7%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
069832
64.2%
.36259
33.3%
12686
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
069832
64.2%
.36259
33.3%
12686
 
2.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
34034 
1.0
 
2225

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34034  93.9%
1.0    2225   6.1%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
070293
64.6%
.36259
33.3%
12225
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070293
96.9%
12225
 
3.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070293
64.6%
.36259
33.3%
12225
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070293
64.6%
.36259
33.3%
12225
 
2.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35067 
1.0
 
1192

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35067  96.7%
1.0    1192   3.3%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
071326
65.6%
.36259
33.3%
11192
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071326
98.4%
11192
 
1.6%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071326
65.6%
.36259
33.3%
11192
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071326
65.6%
.36259
33.3%
11192
 
1.1%

health_problem_Back or Neck
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 31569
1.0: 4690

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    31569  87.1%
1.0    4690   12.9%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
067828
62.4%
.36259
33.3%
14690
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
067828
93.5%
14690
 
6.5%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
067828
62.4%
.36259
33.3%
14690
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
067828
62.4%
.36259
33.3%
14690
 
4.3%

health_problem_Arthritis
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 31324
1.0: 4935

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    31324  86.4%
1.0    4935   13.6%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
067583
62.1%
.36259
33.3%
14935
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
067583
93.2%
14935
 
6.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
067583
62.1%
.36259
33.3%
14935
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
067583
62.1%
.36259
33.3%
14935
 
4.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35908 
1.0
 
351

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35908  99.0%
1.0    351    1.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072167
66.3%
.36259
33.3%
1351
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072167
99.5%
1351
 
0.5%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072167
66.3%
.36259
33.3%
1351
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072167
66.3%
.36259
33.3%
1351
 
0.3%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35779 
1.0
 
480

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35779  98.7%
1.0    480    1.3%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072038
66.2%
.36259
33.3%
1480
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072038
99.3%
1480
 
0.7%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072038
66.2%
.36259
33.3%
1480
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072038
66.2%
.36259
33.3%
1480
 
0.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
34962 
1.0
 
1297

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34962  96.4%
1.0    1297   3.6%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
071221
65.5%
.36259
33.3%
11297
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071221
98.2%
11297
 
1.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071221
65.5%
.36259
33.3%
11297
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071221
65.5%
.36259
33.3%
11297
 
1.2%

health_problem_Stroke
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 35827
1.0: 432

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35827  98.8%
1.0    432    1.2%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072086
66.3%
.36259
33.3%
1432
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072086
99.4%
1432
 
0.6%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072086
66.3%
.36259
33.3%
1432
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072086
66.3%
.36259
33.3%
1432
 
0.4%

health_problem_Blood Pressure
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 34057
1.0: 2202

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34057  93.9%
1.0    2202   6.1%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
070316
64.6%
.36259
33.3%
12202
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070316
97.0%
12202
 
3.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070316
64.6%
.36259
33.3%
12202
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070316
64.6%
.36259
33.3%
12202
 
2.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
36203 
1.0
 
56

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    36203  99.8%
1.0    56     0.2%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072462
66.6%
.36259
33.3%
156
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072462
99.9%
156
 
0.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072462
66.6%
.36259
33.3%
156
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072462
66.6%
.36259
33.3%
156
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
35244 
1.0
 
1015

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35244  97.2%
1.0    1015   2.8%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
071503
65.7%
.36259
33.3%
11015
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071503
98.6%
11015
 
1.4%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071503
65.7%
.36259
33.3%
11015
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071503
65.7%
.36259
33.3%
11015
 
0.9%

health_problem_Heart
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 34923
1.0: 1336

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34923  96.3%
1.0    1336   3.7%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
071182
65.4%
.36259
33.3%
11336
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071182
98.2%
11336
 
1.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071182
65.4%
.36259
33.3%
11336
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071182
65.4%
.36259
33.3%
11336
 
1.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
34976 
1.0
 
1283

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34976  96.5%
1.0    1283   3.5%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
071235
65.5%
.36259
33.3%
11283
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071235
98.2%
11283
 
1.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071235
65.5%
.36259
33.3%
11283
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071235
65.5%
.36259
33.3%
11283
 
1.2%

health_problem_Diabetes
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 34672
1.0: 1587

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34672  95.6%
1.0    1587   4.4%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
070931
65.2%
.36259
33.3%
11587
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070931
97.8%
11587
 
2.2%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070931
65.2%
.36259
33.3%
11587
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070931
65.2%
.36259
33.3%
11587
 
1.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
36162 
1.0
 
97

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    36162  99.7%
1.0    97     0.3%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072421
66.6%
.36259
33.3%
197
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072421
99.9%
197
 
0.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072421
66.6%
.36259
33.3%
197
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072421
66.6%
.36259
33.3%
197
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
36187 
1.0
 
72

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    36187  99.8%
1.0    72     0.2%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072446
66.6%
.36259
33.3%
172
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072446
99.9%
172
 
0.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072446
66.6%
.36259
33.3%
172
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072446
66.6%
.36259
33.3%
172
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0
36197 
1.0
 
62

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    36197  99.8%
1.0    62     0.2%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
072456
66.6%
.36259
33.3%
162
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
072456
99.9%
162
 
0.1%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
072456
66.6%
.36259
33.3%
162
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
072456
66.6%
.36259
33.3%
162
 
0.1%

marijuana_use
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 23685
1.0: 12574

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    23685  65.3%
1.0    12574  34.7%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
059944
55.1%
.36259
33.3%
112574
 
11.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
059944
82.7%
112574
 
17.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
059944
55.1%
.36259
33.3%
112574
 
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
059944
55.1%
.36259
33.3%
112574
 
11.6%

marijuana_per_month
Real number (ℝ≥0)

ZEROS

Distinct: 33
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.230260073
Minimum: 0
Maximum: 99
Zeros: 32756
Zeros (%): 90.3%
Negative: 0
Negative (%): 0.0%
Memory size: 283.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 7
Maximum: 99
Range: 99
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 5.286753676
Coefficient of variation (CV): 4.297265099
Kurtosis: 34.38896561
Mean: 1.230260073
Median Absolute Deviation (MAD): 0
Skewness: 5.225379535
Sum: 44608
Variance: 27.94976443
Monotonicity: Not monotonic
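
Several of these figures can be cross-checked against each other. The sketch below recomputes the mean, coefficient of variation, variance and zero share for marijuana_per_month from the reported sum, standard deviation and zero count, taking the 36259 observations from the dataset overview. It is a consistency check on the printed numbers, not a description of how the profiling tool computes them.

```python
import math

n = 36259           # number of observations (dataset overview)
total = 44608       # Sum
std = 5.286753676   # Standard deviation
zeros = 32756       # Zeros

mean = total / n                # should agree with the reported Mean
cv = std / mean                 # coefficient of variation = std / mean
variance = std ** 2             # variance = std squared
zeros_pct = 100 * zeros / n     # share of rows equal to zero, in percent

print(round(mean, 6), round(cv, 6), round(variance, 6), round(zeros_pct, 1))
```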
Histogram with fixed size bins (bins=33)
Value              Count  Frequency (%)
0                  32756  90.3%
1                  721    2.0%
30                 668    1.8%
2                  334    0.9%
5                  221    0.6%
3                  208    0.6%
20                 179    0.5%
15                 167    0.5%
25                 155    0.4%
10                 150    0.4%
Other values (23)  700    1.9%

Value  Count  Frequency (%)
0      32756  90.3%
1      721    2.0%
2      334    0.9%
3      208    0.6%
4      129    0.4%
5      221    0.6%
6      76     0.2%
7      67     0.2%
8      57     0.2%
9      28     0.1%

Value  Count  Frequency (%)
99     4      < 0.1%
77     2      < 0.1%
30     668    1.8%
29     32     0.1%
28     72     0.2%
27     24     0.1%
26     9      < 0.1%
25     155    0.4%
24     27     0.1%
23     16     < 0.1%
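
The two largest values, 77 and 99, exceed the number of days in a month. One plausible reading, which the report itself does not confirm, is that they are survey sentinel codes rather than real counts. Under that assumption, a minimal sketch for masking them before analysis could look like this; MAX_DAYS_PER_MONTH and mask_out_of_range are hypothetical names chosen for illustration.

```python
MAX_DAYS_PER_MONTH = 30  # assumed upper bound for a days-per-month count

def mask_out_of_range(values, upper=MAX_DAYS_PER_MONTH):
    """Replace values above `upper` with None, leaving the rest unchanged."""
    return [v if v <= upper else None for v in values]

sample = [0, 1, 30, 77, 99]  # illustrative values taken from the tables above
print(mask_out_of_range(sample))
```

Whether the masked rows should be dropped or imputed depends on what the codes mean in the source survey, so that choice is left open here.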

cocaine_use
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 283.4 KiB
0.0: 31707
1.0: 4552

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    31707  87.4%
1.0    4552   12.6%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
067966
62.5%
.36259
33.3%
14552
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
067966
93.7%
14552
 
6.3%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
067966
62.5%
.36259
33.3%
14552
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
067966
62.5%
.36259
33.3%
14552
 
4.2%

cocaine_number_uses
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.3825257178
Minimum0
Maximum6
Zeros32310
Zeros (%)89.1%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           4
Maximum                    6
Range                      6
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation1.21669451
Coefficient of variation (CV)3.180686825
Kurtosis10.51949927
Mean0.3825257178
Median Absolute Deviation (MAD)0
Skewness3.353342361
Sum13870
Variance1.480345532
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      32310  89.1%
2      956    2.6%
3      782    2.2%
6      719    2.0%
4      698    1.9%
5      428    1.2%
1      366    1.0%

Value  Count  Frequency (%)
0      32310  89.1%
1      366    1.0%
2      956    2.6%
3      782    2.2%
4      698    1.9%
5      428    1.2%
6      719    2.0%

Value  Count  Frequency (%)
6      719    2.0%
5      428    1.2%
4      698    1.9%
3      782    2.2%
2      956    2.6%
1      366    1.0%
0      32310  89.1%
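
Because cocaine_number_uses has only seven distinct values, its summary statistics can be cross-checked directly against the frequency table. The sketch below recomputes the sum, mean, variance, standard deviation, coefficient of variation, and share of zeros from the value/count pairs listed above. It assumes the reported variance uses the sample convention (n - 1 in the denominator); the variable names are illustrative and not part of the report.

```python
import math

# Value -> count pairs for cocaine_number_uses, copied from the frequency table.
counts = {0: 32310, 1: 366, 2: 956, 3: 782, 4: 698, 5: 428, 6: 719}

n = sum(counts.values())                                 # number of observations
total = sum(value * count for value, count in counts.items())  # the "Sum" row
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                          # coefficient of variation
zeros_pct = 100 * counts[0] / n                          # the "Zeros (%)" row

print("n =", n, "sum =", total)
print("mean =", mean)
print("variance =", variance, "std =", std)
print("CV =", cv, "zeros (%) =", round(zeros_pct, 1))
```

The counts add up to the 36259 observations in the dataset and the weighted sum to the reported Sum of 13870, which is a quick way to confirm that a flattened value/count row has been split correctly.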

cocaine_per_month
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct23
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.05822002813
Minimum0
Maximum99
Zeros35837
Zeros (%)98.8%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    99
Range                      99
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation1.111968813
Coefficient of variation (CV)19.09942074
Kurtosis3706.928854
Mean0.05822002813
Median Absolute Deviation (MAD)0
Skewness49.82114522
Sum2111
Variance1.236474641
MonotonicityNot monotonic
Histogram with fixed size bins (bins=23)
Value              Count  Frequency (%)
0                  35837  98.8%
1                  159    0.4%
2                  76     0.2%
3                  45     0.1%
5                  38     0.1%
4                  21     0.1%
6                  11     < 0.1%
10                 11     < 0.1%
30                 11     < 0.1%
8                  9      < 0.1%
Other values (13)  41     0.1%

Value  Count  Frequency (%)
0      35837  98.8%
1      159    0.4%
2      76     0.2%
3      45     0.1%
4      21     0.1%
5      38     0.1%
6      11     < 0.1%
7      6      < 0.1%
8      9      < 0.1%
9      4      < 0.1%

Value  Count  Frequency (%)
99     2      < 0.1%
30     11     < 0.1%
29     1      < 0.1%
28     1      < 0.1%
25     2      < 0.1%
24     3      < 0.1%
20     7      < 0.1%
18     1      < 0.1%
16     1      < 0.1%
15     9      < 0.1%

heroine_use
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0  35567
1.0  692

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35567  98.1%
1.0    692    1.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35567  98.1%
1.0    692    1.9%

Most occurring characters

ValueCountFrequency (%)
071826
66.0%
.36259
33.3%
1692
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071826
99.0%
1692
 
1.0%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071826
66.0%
.36259
33.3%
1692
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071826
66.0%
.36259
33.3%
1692
 
0.6%

heronine_per_month
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.02051904355
Minimum0
Maximum30
Zeros36200
Zeros (%)99.8%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    30
Range                      30
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation0.6881248023
Coefficient of variation (CV)33.53591023
Kurtosis1514.204492
Mean0.02051904355
Median Absolute Deviation (MAD)0
Skewness38.06270681
Sum744
Variance0.4735157436
MonotonicityNot monotonic
Histogram with fixed size bins (bins=15)
Value             Count  Frequency (%)
0                 36200  99.8%
1                 16     < 0.1%
30                11     < 0.1%
20                6      < 0.1%
10                4      < 0.1%
3                 3      < 0.1%
4                 3      < 0.1%
2                 3      < 0.1%
25                3      < 0.1%
15                2      < 0.1%
Other values (5)  8      < 0.1%

Value  Count  Frequency (%)
0      36200  99.8%
1      16     < 0.1%
2      3      < 0.1%
3      3      < 0.1%
4      3      < 0.1%
5      2      < 0.1%
6      1      < 0.1%
8      2      < 0.1%
10     4      < 0.1%
15     2      < 0.1%

Value  Count  Frequency (%)
30     11     < 0.1%
28     1      < 0.1%
25     3      < 0.1%
23     2      < 0.1%
20     6      < 0.1%
15     2      < 0.1%
10     4      < 0.1%
8      2      < 0.1%
6      1      < 0.1%
5      2      < 0.1%

meth_use
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0  34468
1.0  1791

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34468  95.1%
1.0    1791   4.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34468  95.1%
1.0    1791   4.9%

Most occurring characters

ValueCountFrequency (%)
070727
65.0%
.36259
33.3%
11791
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
070727
97.5%
11791
 
2.5%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
070727
65.0%
.36259
33.3%
11791
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
070727
65.0%
.36259
33.3%
11791
 
1.6%

meth_number_uses
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1612289363
Minimum0
Maximum6
Zeros34660
Zeros (%)95.6%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    6
Range                      6
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation0.8291908009
Coefficient of variation (CV)5.142940344
Kurtosis32.07996574
Mean0.1612289363
Median Absolute Deviation (MAD)0
Skewness5.615153711
Sum5846
Variance0.6875573843
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      34660  95.6%
6      376    1.0%
2      341    0.9%
3      326    0.9%
4      254    0.7%
5      153    0.4%
1      149    0.4%

Value  Count  Frequency (%)
0      34660  95.6%
1      149    0.4%
2      341    0.9%
3      326    0.9%
4      254    0.7%
5      153    0.4%
6      376    1.0%

Value  Count  Frequency (%)
6      376    1.0%
5      153    0.4%
4      254    0.7%
3      326    0.9%
2      341    0.9%
1      149    0.4%
0      34660  95.6%

meth_per_month
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct20
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.02986844645
Minimum0
Maximum30
Zeros36111
Zeros (%)99.6%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           0
Maximum                    30
Range                      30
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation0.7163614399
Coefficient of variation (CV)23.98388684
Kurtosis1116.515116
Mean0.02986844645
Median Absolute Deviation (MAD)0
Skewness31.76947474
Sum1083
Variance0.5131737126
MonotonicityNot monotonic
Histogram with fixed size bins (bins=20)
Value              Count  Frequency (%)
0                  36111  99.6%
1                  42     0.1%
2                  24     0.1%
5                  19     0.1%
3                  11     < 0.1%
20                 9      < 0.1%
15                 8      < 0.1%
30                 7      < 0.1%
10                 7      < 0.1%
4                  4      < 0.1%
Other values (10)  17     < 0.1%

Value  Count  Frequency (%)
0      36111  99.6%
1      42     0.1%
2      24     0.1%
3      11     < 0.1%
4      4      < 0.1%
5      19     0.1%
6      2      < 0.1%
7      3      < 0.1%
8      1      < 0.1%
10     7      < 0.1%

Value  Count  Frequency (%)
30     7      < 0.1%
28     1      < 0.1%
27     2      < 0.1%
25     2      < 0.1%
24     1      < 0.1%
22     1      < 0.1%
20     9      < 0.1%
15     8      < 0.1%
14     1      < 0.1%
12     3      < 0.1%

inject_drugs
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0  35602
1.0  657

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    35602  98.2%
1.0    657    1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    35602  98.2%
1.0    657    1.8%

Most occurring characters

ValueCountFrequency (%)
071861
66.1%
.36259
33.3%
1657
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071861
99.1%
1657
 
0.9%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071861
66.1%
.36259
33.3%
1657
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071861
66.1%
.36259
33.3%
1657
 
0.6%

rehab_program
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0.0  34987
1.0  1272

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters108777
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value  Count  Frequency (%)
0.0    34987  96.5%
1.0    1272   3.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0.0    34987  96.5%
1.0    1272   3.5%

Most occurring characters

ValueCountFrequency (%)
071246
65.5%
.36259
33.3%
11272
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72518
66.7%
Other Punctuation36259
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
071246
98.2%
11272
 
1.8%
Other Punctuation
ValueCountFrequency (%)
.36259
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
071246
65.5%
.36259
33.3%
11272
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII108777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
071246
65.5%
.36259
33.3%
11272
 
1.2%

start_smoking_age
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct62
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.379767782
Minimum0
Maximum76
Zeros21426
Zeros (%)59.1%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         16
95-th percentile           23
Maximum                    76
Range                      76
Interquartile range (IQR)  16

Descriptive statistics

Standard deviation9.524207869
Coefficient of variation (CV)1.290583681
Kurtosis0.3638056222
Mean7.379767782
Median Absolute Deviation (MAD)0
Skewness0.9326817447
Sum267583
Variance90.71053554
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0                  21426  59.1%
18                 2223   6.1%
16                 1873   5.2%
17                 1500   4.1%
15                 1496   4.1%
20                 1048   2.9%
14                 1028   2.8%
19                 941    2.6%
13                 778    2.1%
21                 611    1.7%
Other values (52)  3335   9.2%

Value  Count  Frequency (%)
0      21426  59.1%
6      2      < 0.1%
7      72     0.2%
8      70     0.2%
9      109    0.3%
10     156    0.4%
11     151    0.4%
12     515    1.4%
13     778    2.1%
14     1028   2.8%

Value  Count  Frequency (%)
76     1      < 0.1%
74     1      < 0.1%
72     2      < 0.1%
71     1      < 0.1%
68     1      < 0.1%
65     1      < 0.1%
64     1      < 0.1%
63     1      < 0.1%
59     1      < 0.1%
58     5      < 0.1%

current_smoker
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
0  20789
1  8359
3  5780
2  1331

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters36259
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      20789  57.3%
1      8359   23.1%
3      5780   15.9%
2      1331   3.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      20789  57.3%
1      8359   23.1%
3      5780   15.9%
2      1331   3.7%

Most occurring characters

ValueCountFrequency (%)
020789
57.3%
18359
23.1%
35780
 
15.9%
21331
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number36259
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
020789
57.3%
18359
23.1%
35780
 
15.9%
21331
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Common36259
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
020789
57.3%
18359
23.1%
35780
 
15.9%
21331
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII36259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
020789
57.3%
18359
23.1%
35780
 
15.9%
21331
 
3.7%

previous_cigarettes_per_day
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct55
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.48793403
Minimum0
Maximum95
Zeros28368
Zeros (%)78.2%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           20
Maximum                    95
Range                      95
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation9.731543943
Coefficient of variation (CV)2.790059634
Kurtosis21.31834141
Mean3.48793403
Median Absolute Deviation (MAD)0
Skewness4.068214845
Sum126469
Variance94.70294752
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0                  28368  78.2%
20                 1967   5.4%
10                 1138   3.1%
1                  766    2.1%
40                 555    1.5%
5                  511    1.4%
3                  489    1.3%
30                 423    1.2%
2                  412    1.1%
4                  322    0.9%
Other values (45)  1308   3.6%

Value  Count  Frequency (%)
0      28368  78.2%
1      766    2.1%
2      412    1.1%
3      489    1.3%
4      322    0.9%
5      511    1.4%
6      238    0.7%
7      107    0.3%
8      108    0.3%
9      14     < 0.1%

Value  Count  Frequency (%)
95     36     0.1%
90     7      < 0.1%
80     34     0.1%
75     3      < 0.1%
70     8      < 0.1%
68     1      < 0.1%
65     2      < 0.1%
60     178    0.5%
55     3      < 0.1%
53     1      < 0.1%

current_cigarettes_per_day
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct40
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.425770154
Minimum0
Maximum95
Zeros28822
Zeros (%)79.5%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           20
Maximum                    95
Range                      95
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation6.64172404
Coefficient of variation (CV)2.73798572
Kurtosis27.23411923
Mean2.425770154
Median Absolute Deviation (MAD)0
Skewness4.177692827
Sum87956
Variance44.11249822
MonotonicityNot monotonic
Histogram with fixed size bins (bins=40)
Value              Count  Frequency (%)
0                  28822  79.5%
20                 1421   3.9%
10                 1308   3.6%
1                  633    1.7%
5                  567    1.6%
15                 472    1.3%
3                  471    1.3%
2                  466    1.3%
4                  379    1.0%
6                  310    0.9%
Other values (30)  1410   3.9%

Value  Count  Frequency (%)
0      28822  79.5%
1      633    1.7%
2      466    1.3%
3      471    1.3%
4      379    1.0%
5      567    1.6%
6      310    0.9%
7      212    0.6%
8      200    0.6%
9      36     0.1%

Value  Count  Frequency (%)
95     12     < 0.1%
90     3      < 0.1%
80     1      < 0.1%
70     1      < 0.1%
60     18     < 0.1%
50     13     < 0.1%
48     1      < 0.1%
45     5      < 0.1%
43     1      < 0.1%
40     180    0.5%

days_quit_smoking
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct144
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1544.790838
Minimum0
Maximum26280
Zeros27934
Zeros (%)77.0%
Negative0
Negative (%)0.0%
Memory size283.4 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           10950
Maximum                    26280
Range                      26280
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation3828.916153
Coefficient of variation (CV)2.478598435
Kurtosis7.161980489
Mean1544.790838
Median Absolute Deviation (MAD)0
Skewness2.758623846
Sum56012571
Variance14660598.91
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   27934  77.0%
7300                518    1.4%
10950               472    1.3%
3650                431    1.2%
5475                406    1.1%
9125                324    0.9%
1825                312    0.9%
14600               300    0.8%
730                 286    0.8%
1095                284    0.8%
Other values (134)  4992   13.8%

Value  Count  Frequency (%)
0      27934  77.0%
1      9      < 0.1%
2      7      < 0.1%
3      5      < 0.1%
4      7      < 0.1%
5      3      < 0.1%
6      1      < 0.1%
7      31     0.1%
8      1      < 0.1%
10     3      < 0.1%

Value  Count  Frequency (%)
26280  1      < 0.1%
25915  1      < 0.1%
25550  1      < 0.1%
24455  1      < 0.1%
24090  5      < 0.1%
23725  5      < 0.1%
22995  2      < 0.1%
22630  2      < 0.1%
22265  1      < 0.1%
21900  20     0.1%

household_smokers
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size283.4 KiB
4  17592
0  10780
1  4721
2  2452
3  714

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters36259
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row4
2nd row4
3rd row4
4th row1
5th row4

Common Values

Value  Count  Frequency (%)
4      17592  48.5%
0      10780  29.7%
1      4721   13.0%
2      2452   6.8%
3      714    2.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
4      17592  48.5%
0      10780  29.7%
1      4721   13.0%
2      2452   6.8%
3      714    2.0%

Most occurring characters

ValueCountFrequency (%)
417592
48.5%
010780
29.7%
14721
 
13.0%
22452
 
6.8%
3714
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number36259
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
417592
48.5%
010780
29.7%
14721
 
13.0%
22452
 
6.8%
3714
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Common36259
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
417592
48.5%
010780
29.7%
14721
 
13.0%
22452
 
6.8%
3714
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII36259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
417592
48.5%
010780
29.7%
14721
 
13.0%
22452
 
6.8%
3714
 
2.0%

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
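
As a minimal sketch of that calculation (not the report generator's own code), the function below divides the covariance by the product of the standard deviations; the 1/n factors cancel, so plain sums of deviations are enough. The name `pearson_r` and the toy values are hypothetical and are not taken from the dataset.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (co-)deviations; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Shifting and rescaling x (location and scale change) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(r, r_rescaled)
```

The second call illustrates the invariance property mentioned above: a shift and a positive rescaling of one variable do not change r.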

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
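
The rank-based calculation can be sketched as follows: replace each value by its rank (ties share the average rank, which is an assumption about tie handling), then apply the Pearson formula to the ranks. The helper names and the toy data are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values all receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical toy values: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
rho = spearman_rho(x, y)
r_linear = pearson_r(x, y)
print(rho, r_linear)
```

On this toy input the rank correlation is perfect while Pearson's r stays below 1, which is the sense in which ρ captures nonlinear monotonic relationships.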

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
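
A minimal sketch of this pair-counting definition is shown below. It implements the plain form described here, in which tied pairs count as neither concordant nor discordant; statistical libraries commonly use tie-adjusted variants such as τ-b, so values can differ when the data contain ties. The name `kendall_tau_a` and the toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie adjustment."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy values: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

The loop is quadratic in the number of observations, which is acceptable for illustration but slow for a dataset of this size.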

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
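
The bias-corrected coefficient can be sketched from a contingency table as below. This follows the commonly cited form of Bergsma's correction and is an illustration under that assumption, not the report's own implementation; the function name and the two toy tables are hypothetical, and every row and column total is assumed to be non-zero.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical 2x2 tables: proportional rows (independence) and a diagonal table.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The two toy tables mark the ends of the scale: proportional rows give a coefficient of 0, and a purely diagonal table gives a coefficient of 1 up to floating-point rounding.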

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

SEQNdepressiongenderageracecitizenshipeducation_levelmarital_statushousehold_sizepregnantbirth_placeveteranhousehold_incomeasthmaasthma_onsetasthma_currentlyasthma_emergencyanemiaever_overweightblood_transfusionarthritisheart_failureheart_diseaseanginaheart_attackstrokeemphysemabronchitisliver_conditionthyroid_problembronchitis_currentlyliver_condition_currentlythyroid_problem_currentlycancerfirst_cancer_typesecond_cancer_typethird_cancer_typefourth_cancer_countheart_attack_relativeasthma_relativediabetes_relativehay_feverarthritis_onsetheart_failure_onsetheart_disease_onsetangina_onsetheart_attack_onsetstroke_onsetemphysema_onsetbronchitis_onsetliver_condition_onsetthyroid_problem_onsetcancer_onsetarthritis_typefirst_cancer_countsecond_cancer_countthird_cancer_countweightheightBMIpulseirregular_pulsesystolicdiastolicalbuminALTASTALPBUNcalciumCO2creatinineGGTglucoseironLHDphosphorusbilirubintotal_proteinuric_acidsodiumpotassiumchlorideosmolalityglobulintrouble_sleeping_historysleep_hoursvigorous_recreationmoderate_recreationsedentary_timevigorous_workmoderate_workdrinks_per_occasionlifetime_alcohol_consumptiondrinks_past_yearcant_worklimited_workwalking_equipmentmemory_problemslimitationshealthcare_equipmenthealth_problem_Other Impairmenthealth_problem_Bone or Jointhealth_problem_Weighthealth_problem_Back or Neckhealth_problem_Arthritishealth_problem_Cancerhealth_problem_Other Injuryhealth_problem_Breathinghealth_problem_Strokehealth_problem_Blood Pressurehealth_problem_Mental Retardationhealth_problem_Hearinghealth_problem_Hearthealth_problem_Visionhealth_problem_Diabeteshealth_problem_Birth Defecthealth_problem_Senilityhealth_problem_Other Developmentalmarijuana_usemarijuana_per_monthcocaine_usecocaine_number_usescocaine_per_monthheroine_useheronine_per_monthmeth_usemeth_number_usesmeth_per_monthinject_drugsrehab_programstart_smoking_agecurrent_smokerprevious_cigarettes_per_daycurrent_cigarettes_per_daydays_quit_smokinghousehold_smokers
031131Not Depressed144BlackCitizen4Married4NoUSANo110.00.00.00.00.00.01.00.01.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.01.00.01.00.00.042.00.00.00.00.00.00.00.00.00.0Missing0.00.00.075.2156.030.9058.00.0144.074.03.514.016.074.06.08.923.00.817.087.051.0105.03.40.46.94.9137.04.1106.0271.03.40.09.0NoNo150.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004
131132Not Depressed070WhiteCitizen5Married2MissingUSAYes110.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.00.00.01.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.063.00.0Missing0.00.00.069.5167.624.7462.00.0138.060.05.031.029.048.025.09.929.01.222.0155.089.0165.03.41.07.27.2140.03.8102.0287.02.20.07.0NoYes150.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004
231134Not Depressed073WhiteCitizen3Married2MissingUSAYes50.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.0101.9182.430.6350.00.0130.068.03.930.031.077.013.09.527.01.233.093.084.0158.03.30.57.17.5139.04.1103.0277.03.21.07.0NoYes90.00.01.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004
331139Not Depressed118Other HispanicCitizen0Never Married3NoUSANo111.01.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.073.9158.429.4568.00.0110.064.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.06.0YesYes120.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000001
431143Not Depressed019WhiteCitizen0Never Married3MissingUSANo110.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.076.4184.022.5768.00.0108.062.04.722.023.047.012.010.228.00.813.076.0158.093.04.51.07.24.3143.04.1104.0283.02.50.07.0YesNo360.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000204
531144Not Depressed021Other HispanicCitizen3Never Married6MissingUSANo30.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.01.01.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.069.9167.125.0354.00.0116.074.04.115.018.0183.09.010.327.00.715.081.0114.0120.04.70.77.36.0139.04.4103.0275.03.20.08.0YesYes180.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004
631149Not Depressed185WhiteCitizen2Widowed1MissingUSANo10.00.00.00.00.00.01.01.00.00.00.00.00.00.00.00.01.00.00.01.01.0ColonNoneNone0.00.00.00.00.077.00.00.00.00.00.00.00.00.043.00.0Osteoarthritis1.00.00.051.9154.921.6364.00.0110.00.03.719.028.050.019.09.023.01.112.092.076.0153.04.30.46.35.9140.03.8105.0281.02.60.08.0NoNo300.00.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004
731150Not Depressed079WhiteCitizen3Divorced1MissingUSAYes30.00.00.00.00.00.01.01.00.00.00.00.00.00.00.00.00.00.00.00.01.0ColonNoneNone0.00.00.00.00.058.00.00.00.00.00.00.00.00.00.00.0Other1.00.00.085.0171.428.9366.00.0144.074.04.123.028.067.010.09.523.00.916.094.0142.0150.03.61.37.54.8139.04.0103.0276.03.40.08.0YesYes90.01.00.00.0Missing0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0141200138704
831151Not Depressed159BlackCitizen3Married2NoUSANo71.044.01.00.00.01.00.01.00.00.00.00.01.00.01.00.01.01.00.01.00.0NoneNoneNone0.01.00.01.00.00.00.00.00.00.00.00.00.00.054.00.0Rheumatoid0.00.00.082.9167.629.5156.00.0136.076.03.713.019.099.016.09.126.00.914.082.054.0125.04.00.56.95.4140.04.0108.0280.03.21.02.0NoNo420.00.00.00.0Missing0.00.01.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0251600105854
931152Not Depressed127MexicanCitizen3Married5YesMexicoNo70.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.094.1153.639.8878.01.094.062.02.810.014.072.06.09.119.00.68.072.087.088.04.30.36.44.2139.03.7108.0274.03.60.08.0YesYes180.00.00.00.0Missing0.00.00.00.01.00.00.01.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000004

Last rows

SEQNdepressiongenderageracecitizenshipeducation_levelmarital_statushousehold_sizepregnantbirth_placeveteranhousehold_incomeasthmaasthma_onsetasthma_currentlyasthma_emergencyanemiaever_overweightblood_transfusionarthritisheart_failureheart_diseaseanginaheart_attackstrokeemphysemabronchitisliver_conditionthyroid_problembronchitis_currentlyliver_condition_currentlythyroid_problem_currentlycancerfirst_cancer_typesecond_cancer_typethird_cancer_typefourth_cancer_countheart_attack_relativeasthma_relativediabetes_relativehay_feverarthritis_onsetheart_failure_onsetheart_disease_onsetangina_onsetheart_attack_onsetstroke_onsetemphysema_onsetbronchitis_onsetliver_condition_onsetthyroid_problem_onsetcancer_onsetarthritis_typefirst_cancer_countsecond_cancer_countthird_cancer_countweightheightBMIpulseirregular_pulsesystolicdiastolicalbuminALTASTALPBUNcalciumCO2creatinineGGTglucoseironLHDphosphorusbilirubintotal_proteinuric_acidsodiumpotassiumchlorideosmolalityglobulintrouble_sleeping_historysleep_hoursvigorous_recreationmoderate_recreationsedentary_timevigorous_workmoderate_workdrinks_per_occasionlifetime_alcohol_consumptiondrinks_past_yearcant_worklimited_workwalking_equipmentmemory_problemslimitationshealthcare_equipmenthealth_problem_Other Impairmenthealth_problem_Bone or Jointhealth_problem_Weighthealth_problem_Back or Neckhealth_problem_Arthritishealth_problem_Cancerhealth_problem_Other Injuryhealth_problem_Breathinghealth_problem_Strokehealth_problem_Blood Pressurehealth_problem_Mental Retardationhealth_problem_Hearinghealth_problem_Hearthealth_problem_Visionhealth_problem_Diabeteshealth_problem_Birth Defecthealth_problem_Senilityhealth_problem_Other Developmentalmarijuana_usemarijuana_per_monthcocaine_usecocaine_number_usescocaine_per_monthheroine_useheronine_per_monthmeth_usemeth_number_usesmeth_per_monthinject_drugsrehab_programstart_smoking_agecurrent_smokerprevious_cigarettes_per_daycurrent_cigarettes_per_daydays_quit_smokinghousehold_smokers
36249102934Not Depressed048BlackCitizen3Married3MissingUSANo90.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.083.7178.526.30.00.00.00.03.812.016.059.014.09.128.01.3028.089.096.0191.03.60.47.36.0141.03.8100.0281.03.50.07.5NoNo120.01.00.03.0Yes52.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.024301202
36250102935Depressed127WhiteCitizen5Partner2NoUSANo111.015.00.00.00.01.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.027.00.00.00.00.00.00.00.00.00.00.0Rheumatoid0.00.00.059.5158.923.682.00.0116.064.00.00.00.00.00.00.00.00.000.00.00.00.00.00.00.00.00.00.00.00.00.01.08.0YesYes720.01.00.01.0Yes182.01.01.00.01.00.01.01.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.0000002
36251102943Not Depressed148MexicanCitizen4Partner6MissingUSANo121.041.00.00.00.01.00.00.00.00.00.00.00.00.00.01.00.00.01.00.00.0NoneNoneNone0.00.00.01.00.00.00.00.00.00.00.00.00.048.00.00.0Missing0.00.00.0122.7181.137.474.00.0134.080.03.931.030.092.016.09.624.00.6526.086.050.0161.04.00.37.55.5143.04.0102.0285.03.60.05.5NoNo60.01.01.02.0Yes5.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000000
36252102944Not Depressed055Other and MultiracialCitizen3Married4MissingMexicoNo111.042.01.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.078.8163.129.682.00.00.00.04.139.031.044.014.09.226.00.8349.093.075.0183.03.30.57.76.8143.03.1100.0285.03.60.06.0NoNo600.00.01.03.0Yes12.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000000
36253102947Not Depressed075BlackCitizen5Divorced1MissingUSANo80.00.00.00.00.01.01.00.00.00.00.00.00.00.00.00.00.00.00.00.01.0ProstateNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.068.0Missing1.00.00.073.2170.425.260.00.0160.082.04.522.026.064.025.09.524.01.8126.077.085.0469.03.60.67.76.8142.04.9104.0286.03.20.07.0NoYes240.00.00.01.0Yes5.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.019150127750
36254102949Not Depressed033WhiteCitizen3Partner5MissingUSANo60.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.076.9180.123.796.00.0120.072.04.526.029.0104.020.09.225.01.0514.096.097.0185.04.50.37.35.4143.05.3103.0287.02.80.06.5NoNo60.01.01.00.0No0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.01.05.01.03.00.00.00.01.02.00.00.01.018302002
36255102952Not Depressed170Other and MultiracialCitizen3Married2MissingMexicoNo40.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.049.0156.520.068.00.0136.074.04.822.027.057.013.09.927.00.7018.0150.0140.0168.03.71.27.46.4143.04.1100.0288.02.60.08.5NoYes120.00.00.00.0Yes0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000000
36256102953Not Depressed042MexicanNot Citizen3Separated1MissingMexicoNo51.042.01.00.00.01.00.00.00.00.00.00.00.00.00.01.00.00.00.00.00.0NoneNoneNone0.00.01.00.01.00.00.00.00.00.00.00.00.034.00.00.0Missing0.00.00.097.4164.935.878.00.0124.076.04.040.029.0115.017.09.024.00.9228.0101.070.0136.03.50.67.55.8144.03.8106.0289.03.51.06.0NoNo360.01.01.012.0Yes30.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0201102700
36257102954Not Depressed141BlackCitizen5Never Married7NoUSANo100.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.01.01.00.00.00.00.00.00.00.00.00.00.00.00.0Missing0.00.00.069.1162.626.178.00.0116.066.03.96.015.055.08.09.021.00.698.088.020.0123.03.50.26.53.1137.03.6101.0272.02.60.08.0NoYes600.00.00.00.0No0.00.01.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0000000
36258102956Not Depressed038WhiteCitizen4Divorced5MissingUSANo70.00.00.00.00.01.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.0NoneNoneNone0.00.00.00.00.027.00.00.00.00.00.00.00.00.00.00.0Osteoarthritis0.00.00.0111.5175.836.176.00.0150.098.04.347.027.084.012.09.727.00.8272.090.0101.0134.03.00.37.26.4139.04.399.0277.02.91.08.0NoNo720.00.00.02.0Yes9.01.01.01.00.00.00.01.01.00.00.01.00.00.00.00.00.00.00.00.00.00.00.00.00.01.00.00.00.00.00.00.00.00.00.00.00.016302001